Extending a 140-character conversation with David Sanger and Sam Bloomberg-Rissman about editing, filtering and the business opportunity for middlemen in stock photography.

@tdavidson: via @microstock Hate it or understand it, Fotolia takes the free photos strategy up a notch with PhotoXpress: http://bit.ly/photoxpress
@davidsanger: @tdavidson stock photography without editing cannot scale. Imagine a million Eiffel Tower pix in a search result.
@tdavidson: @davidsanger editing or filtering? fighting to limit supply is a losing battle. #stockphotography
@sambr: @davidsanger @tdavidson does stock photography need editing or a google type image rank? The cream rises to the top
@tdavidson: @sambr exactly; better filtering is future, not editing. key: create infrastructure to match costs with benefits (contributors & middlemen)
@davidsanger: @tdavidson #stockphotography filters need data, alamy has keywords & clickthru data www.bit.ly/6stzN | google has less www.bit.ly/82LPi
@davidsanger: @sambr @tdavidson even image rank can’t scale to billions of images. authority has to come from trusted sources
@tdavidson: @davidsanger @sambr #stockphotography “authority” will come from the masses, not the few.
@davidsanger: @tdavidson @sambr issue is authority, relevance, trust. Consider the math. How to find a good image in smugmug or smugmug x 10000?
@tdavidson: @davidsanger re: filters need data: but that’s an opp for innovation. google & alamy’s current data shapes, but doesn’t define, the future.
@tdavidson: @davidsanger @sambr #stockphotography algorithms, filters to define relevance, authority will *have* to scale; better math is the opp :)
@tdavidson: @davidsanger @sambr basically, the industry pressures aren’t going away; attempting to limit creation and distribution is pointless.
@davidsanger: @tdavidson @sambr authority combined with trustworthiness = relevance in #stockphotography
@davidsanger: @tdavidson @sambr I don’t think the issue is limiting creativity but finding what’s good. a new opening for photo researchers
@tdavidson: @davidsanger @sambr great point David; an opening for a new kind of photo researcher, a better algorithm + person in #stockphotography
@sambr: @tdavidson @davidsanger there are tools that need to be developed. flickr has interestingness, alamy has rank but both are clearly wanting

Looking back, I made a mistake in initially assuming “editing” meant suppressing supply, a mistake in interpreting the full meaning behind a 140-character message. The issue in stock photography is how to connect buyers and sellers in a world where demand has shifted and the market is saturated with images (in quantity, if not necessarily in quality); the traditional middlemen have struggled to create an efficient market that matches demand and supply in this new environment.

The trends in the photography business mirror the larger issues around content on the web; I won’t belabor the point, except to note that this problem is an opportunity for a new kind of middleman to create better filters AND editors.

The joy of a spontaneous conversation…

Hello, I'm Taylor Davidson.
I'm an early-stage VC and a photographer.
  • http://hyperbio.net/ Leila Boujnane

    I think your middleman is actually a middlewoman!
    The future of editing (whatever it means) is actually search. Better search tools. Not better editing tools.

  • http://www.taylordavidson.com/writing/ Taylor Davidson

    The eternal problem with gender-specific terms to describe business, shame on me.

    But I agree; whether we call it editing, filtering or search is unimportant (at least to justify my own regular butchering of language); the supply isn't going away, so we're going to have to find better ways to find what we want.

  • http://twitter.com/corkran Lee Corkran

    ranking is WRONG. The key is better search and display. Don't think of a paginated display. Think lighttable! Begin with brightqube.com

  • http://twitter.com/corkran Lee Corkran

    Simple tools a la kayak.com. Quantified quality. information visualization: overview, refine, detail on demand. SHOW the images to the user

  • http://www.taylordavidson.com/writing/ Taylor Davidson

    If by “ranking” you mean the few selecting for the masses, then I'll agree.

    But “ranking” works if it's a reflection of the masses' aggregated decisions: that's what PageRank essentially does.

    “Ranking” implies a way of picking out content to display, but doesn't specify how to display it (meaning, ranking doesn't imply paginated display, even if that's how most systems do it, a failure in my opinion); for the sake of “ranking” images, displaying them for people is a necessity.

    One last note: “ranking” need not define good / bad; it may define more / less popular, more / less relevant, more / less used, more / less unique; in that sense ranking does help people make decisions more efficiently.

  • http://twitter.com/sambr sam b-r

    I think Taylor has it spot on in that last comment. What we need is a better way for consumers of images to find what they need.

    Every person looking for images is going to have a specific need. Are they photobuyers for a national ad campaign? Is it Aunt Mae looking for a nice image of a cute kitten for the background of her phone? What about the tourist about to go to Rome?

    Each one wants a different type of picture and is going to need a different way to find it. But none of them want to spend months wading through the millions and millions of pictures that are floating out there at the moment.

    There needs to be a search/sort/whathaveyou that allows the consumer of images to find the image they are looking for with ease. It's going to take smarter people than me to come up with the algorithms to do this. However, I feel there is going to be a need for human filtering as well. I bet Leila has a much better grasp of the nuts and bolts of this than I do.

  • http://twitter.com/corkran Lee Corkran

    Yep. Totally agree with you here. The user is the definer of the “rank”, which means their inputs create what the “rank” is according to their desires, and those rankings are displayed in that priority. This is the goal we (BrightQube) are pursuing, which is counter to those sites that “rank” the returns according to the owner's desires, not the user's desires, and this is mainly done because most users don't go past the first few pages. That is the flaw of perpetuating a paginated display of ranked results for creative visual content. There is no one right match; it is in the eye of the beholder…if they get a chance to see it.

  • http://www.davidsanger.com David Sanger

    I think it is essential to separate search from presentation. Based on some input search criteria, a search engine calculates a result set of matching objects, which could comprise zero, one, or up to billions of objects.

    That result set has to be ordered in some way, if only to determine what to show first. This is because the human mind can only see so many results at once.

    Google does it by relevance, based on comparing page content and links, and perhaps user search history, with the request.

    It's a good question whether such an automated process would give the best results for photography.

    TechCrunch reports the top six photo sites alone have over 50 billion unique images http://bit.ly/Vy8IV

    Getty traditionally has edited their collection to limit the number of images presented and ensure that each is thoroughly vetted and licensable. From only a few million images, a search for the keyword China returns only 43,000 images, which are then presented roughly by recency.

    Alamy chose to forgo editing and only spot check images for technical competency. They have tried by algorithmic means to determine relevance to order the result set. From 16+ million images a search for China returns 293,000 images, which are presented roughly in order of the popularity of the photographer according to click through and sales data. Even so photo buyers complain of lots of noise, and many good images are effectively buried.

    One reason for this is that the data is fairly sparse. There are so many images that many never get seen at all, let alone get clicked on or sold. And this is in a system which is fully aware of the entire transaction right through to the sale.

    In order to effectively order an answer set, a search engine needs data and there is no guarantee with extremely huge datasets that there will be enough data to rank with any significance. To the extent that this is the case, the relevance ordering will tend to become arbitrary.

    Aside from keywords and connections, social ranking http://bit.ly/17UjqZ can consider a whole range of input from users. Flickr's interestingness, user ratings, etc. make use of this. A search for China on Flickr returns over 4 million images, and one of the presentation options is to present the most interesting first. This is effectively substituting the editorial opinion of the community, at least as to the images which have been seen.

    That is why I made the comment about authority and trustworthiness. In order to improve the chances of making a satisfactory and timely choice it may well be more cost-effective for a user/photo buyer to rely on the expertise, experience and prior knowledge of a trusted researcher or a collection prepared by one.
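The sparse-data point in the comment above can be illustrated with a small sketch. It assumes ranking by click-through rate using the lower bound of the Wilson score interval, which is a common way to rank with uneven sample sizes and is not something the thread attributes to Alamy, Flickr, or Google. The image names and the click and view counts are invented for illustration.

```python
import math

def wilson_lower_bound(clicks: int, views: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for a click-through rate."""
    if views == 0:
        return 0.0  # never seen: there is no evidence to rank on
    p = clicks / views
    denom = 1 + z * z / views
    centre = p + z * z / (2 * views)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * views)) / views)
    return (centre - margin) / denom

# Hypothetical (clicks, views) figures, invented for this example only.
images = {
    "well_seen": (50, 1000),
    "never_seen_a": (0, 0),
    "never_seen_b": (0, 0),
}

scores = {name: wilson_lower_bound(*stats) for name, stats in images.items()}

# sorted() is stable, so tied images keep their input order: the ranking
# between the two unseen images is decided by nothing but that order.
ranked = sorted(images, key=scores.get, reverse=True)
print(ranked)
```

Any image that has never been displayed scores exactly the same as every other unseen image, so the order among them is arbitrary however good the pictures are.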

  • http://www.davidsanger.com David Sanger

    BrightQube still picks only the top 8000 images to show the user, and puts the most “relevant” ones in the middle.

  • http://www.taylordavidson.com/writing/ Taylor Davidson

    Right now the models may be very different, but I expect them to grow closer over time. I guess the real question in my mind is if a search + “curated” (and by curated, I point out Flickr's interestingness metric) model ends up being very similar to PageRank, since pageviews and links are proxies for authority, trustworthiness and popularity.

    Your points around the lack of good data for search engines to use are very valid; right now the best option for buyers may be through trusted researchers and sources, but as search gets better the options for buyers will grow and the role of the gatekeepers will change.

    If you asked me today to find an image for something I would try a mixture of Google and stock agency databases, but I would also ask particular photographers that I knew and perhaps put out requests on twitter, blog, etc. for certain types of images (and btw, I've helped my Mom do this, as she regularly buys images for a commercial framing shop).

    Ask me the same question in 5 years and I may do it differently :)

    Right now editing, filtering, curating, ranking and search et al. may describe different actions and results, but in 5-10-15 years there may be little difference between how they accomplish and achieve the same basic function.

  • http://twitter.com/corkran Lee Corkran

    Hi David, yes, you're right. BrightQube initially displays “only the top 8000” images at the moment. It is an arbitrary cut-off number we launched with based on processing overhead. But all of those “only 8000” are visible at once, not buried behind 30 or 40 pages never to be seen. Search returns that have fewer than “only 8000” results are fully displayed.

    The most “relevant” matches are based on keyword, title and caption information supplied by the contributor. They are weighted and matched to the keyword(s) entered by the user. Further refinement of “relevant” matching can be controlled in Advanced selection using simple sliders for price, size, etc, and other aspects of the collection. We are working on expanding these slider features considerably to allow the user to winnow and refine what is most “relevant” to them. None of the results are weighted/favored by sales, commissions, popularity, or other biases inherent in other sites.
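A rough sketch of the kind of matching described in the comment above: contributor-supplied keywords, title and caption are weighted and matched against the user's query, and a price slider narrows the results. This is not BrightQube's actual code. The field weights, the `Image` class, the `relevance` and `search` functions, and the sample data are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical weights; the thread does not say how the fields are weighted.
FIELD_WEIGHTS = {"keywords": 3.0, "title": 2.0, "caption": 1.0}

@dataclass
class Image:
    id: str
    keywords: str
    title: str
    caption: str
    price: float

def relevance(image: Image, query: str) -> float:
    """Sum weighted counts of query terms found in each text field."""
    terms = query.lower().split()
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        # Naive whitespace tokenizing; punctuation is not stripped.
        words = getattr(image, field).lower().split()
        score += weight * sum(words.count(term) for term in terms)
    return score

def search(images, query, max_price=None):
    """Return matching images, best score first, honoring a price slider."""
    hits = [
        img for img in images
        if relevance(img, query) > 0
        and (max_price is None or img.price <= max_price)
    ]
    return sorted(hits, key=lambda img: relevance(img, query), reverse=True)

library = [
    Image("a", "china shanghai sunrise", "Sunrise over Shanghai", "Skyline at dawn", 50.0),
    Image("b", "port ships", "China trade", "A busy port", 20.0),
    Image("c", "kitten", "Cute kitten", "A kitten", 5.0),
]

print([img.id for img in search(library, "china")])                # ['a', 'b']
print([img.id for img in search(library, "china", max_price=30)])  # ['b']
```

Nothing in the score depends on sales or popularity, which is the property the comment emphasizes. The trade-off is that the ordering is only as good as the metadata the contributor supplied.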

  • http://www.davidsanger.com David Sanger

    My original point was not that stock photography search does not work, but that it does not scale.

    The visual display of BrightQube certainly is attractive, but it has to do exactly the same as any other image search engine: decide what to show the user first. My screen doesn't show 8000 images at once, only the central 153 (on my laptop); beyond that I have to scroll.

    Yet BQ is a very small library – 3 million or so you say. Scale up to 300 billion images and the system breaks down. Keywords can help a user “drill down”, but if a search for “Golden Gate Bridge, Night, Vertical”, cheap, large, returns 150,000 images then the game is lost.

    That is the scale issue.
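The arithmetic behind the scale issue can be worked through with the figures quoted in this thread: 150,000 results, 153 images visible at once on a laptop, and an 8000-image display cap. The variable names are illustrative only.

```python
import math

results = 150_000      # result count from the example query above
per_screen = 153       # images visible at once on the laptop mentioned above
display_cap = 8_000    # initial display limit quoted in the thread

screens_for_all = math.ceil(results / per_screen)
screens_for_cap = math.ceil(display_cap / per_screen)
share_shown = display_cap / results

print(screens_for_all)        # 981
print(screens_for_cap)        # 53
print(f"{share_shown:.1%}")   # 5.3%
```

Even with the cap, a buyer faces 53 screenfuls, and the cap itself hides roughly 95% of the matches, so something still has to decide which 8000 make the cut.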

  • http://twitter.com/corkran Lee Corkran

    Any system has an inherent single point of failure. In this case it is the size of your laptop screen. If you had a larger monitor, then our solution is the closest that would scale to meet your needs. That you can view 153 images at once on your laptop is impressive already. You can then pan and zoom, just like an online map, to get greater/lesser results displayed.

    BrightQube employs information visualization: Overview, refine, detail on demand. It is the closest so far to online image display that can scale effectively. This is only the beginning base of the platform (Beta, if you will).

    And you've refined in your example already down to 150,000. We can handle that. We cannot control your end display size, though. Displaying 300 billion isn't necessary.

    We can certainly scale to handle 300 billion. That is not the issue; it is only technology. And adding further smart search filtering is planned to exploit multiple axes of metadata, geographic clustering of relevance along metadata axis points, etc.

    At some point, though, there is a diminishing return in solving a problem (seeing 300 billion) that doesn't exist in realistic user intent or behavior, only in rhetorical discourse.

  • http://twitter.com/corkran Lee Corkran

    Also, I should add, that I began BrightQube as a project, because I hadn't seen any effective online solution yet that dealt with search and display, and that in my mind the answers to this lay in metadata and information visualization techniques, and fluid capabilities of rich internet applications.

    I also began it because, as a pro photographer, if my pictures are not found on the first 3 pages of a site, they'll rarely if ever be seen at all. And not seen means not sold. That is why I am fundamentally opposed to the “authority” approach to editing, or the hierarchical ranking based on biases imposed by the site owner and not imposed by the user doing the search. It's personal and it's business.

    Being a bootstrapped start-up, we've focused on getting the buyer to the right image in the shortest amount of time. While we've still a long way to go with further refinement features, user uploading capability, etc., this project has borne fruit in that, by and large, the buyers we've attracted love the site for its visual abundance, choice and speed of selection, which saves them time.

  • http://www.davidsanger.com David Sanger

    Lee – it has nothing to do with the laptop or a desktop version. BrightQube does allow you to see somewhat more images than a list or matrix presentation, true.

    But once you get a very, very large number of images, all stock image searches have the same problem: there is no real way to tell them apart and present the most relevant ones in the result set.

    If you have 150,000 images of the same subject with the exact same keywords, then it is not humanly possible to review them all in any medium, and a photo buyer will not want to.

    My point is that we cannot always assume there is distinguishing detailed data for determining relevance, without something akin to editing, the determination of quality.

  • corkran

    Ironically, this is exactly what I was referring to when I said your laptop monitor size was the point of failure. Check this out
    http://www.techcrunch.com/2009/06/08/apples-coo…

  • http://www.taylordavidson.com/writing/ Taylor Davidson

    Isn't the “single point of failure” the human mind's ability to simultaneously process and rank a large number of images, rather than the size of the display? Simply displaying more images doesn't solve the problem of relevance.

  • http://www.taylordavidson.com/writing/ Taylor Davidson

    Completely agreed that “not seen = not sold”. And since I'm a nobody, it's hard to be seen; and that's part of the reason I don't actively try to sell stock, even though I am an Alamy contributor.

    Thus I understand the opposition to the authority approach to editing. For example, I'm steadfastly on the record as disliking Getty's Flickr deal.

    And I do love the way BrightQube displays results :)

    As I mentioned before, user-defined context + search = highly relevant results.

    Buyers get great results (highly relevant, the right image, the quickest) at the intersection of user-defined context + powerful search.

    Which points out 2 problems: 1) lack of rich, standardized image metadata, and 2) difficulty for buyers to really input their needs into search engines. Bad data in, bad matching algorithms, bad data out.

    The method of display helps, but doesn't really change that.

    Until search improves, we're stuck with authority-based systems because it's the cheapest interaction method. Once search improves, interaction costs go down, the “authority of the few” will be replaced by the “authority of the many”. This isn't a new dynamic.

    The key here is the difference between now and the future: right now, better display of results is an approach to solving the problem of finding relevant results. But in the future, we will need to combine display and search, and that's where the innovation will come.

  • http://www.davidsanger.com David Sanger

    Again Lee, it has nothing to do with the screen. One screen or scrolling screen, a photobuyer does not want to have to sort through 15K images without any other distinguishing data. They will pay someone to do it for them.

    It is even more obvious with video.

    The human mind cannot scan thousands of video clips.

  • http://www.davidsanger.com David Sanger

    Taylor – I think there's a more subtle aspect to this as well.

    There are distinctively different kinds of search.

    1. A buyer can be looking for a specific image they already have, like with Tineye.

    2. A buyer can be looking for a specific image they know about, like “that one with a migrant woman and some kids…” or the “French shot of a guy jumping over a puddle”. They can enter text, or draw a likeness on a tablet.

    3. A buyer can be looking for images “like” another one.

    4. A buyer can be looking for images of a very specific subject, place and situation. “two kids on Maracas Beach in Trinidad, with palms on the left, and copy space on the right”

    5. A buyer can be browsing for a theme or concept. “something that shows strength and challenge, humorous, slightly blue tones.”

    6. A buyer can be browsing for ideas. “single shot to show essence of the New China”

    Each of these is in fact a different kind of search with different challenges in determining if images are relevant.

  • http://twitter.com/corkran Lee Corkran

    This is where we differ. I do believe the human eye and creative mind can swiftly scan through enormous volumes of images to find what they are looking for. I've witnessed it and have done so myself. I'm sure you have, too. How can someone realistically outsource their creative aesthetic and intent, short of a Vulcan mind-meld? Unless they are not very creative, I guess, and just want their creative choices handed to them, in which case it is akin to secretarial duty, much like booking a flight for an executive (hmmm, there again, I'd prefer to book my own flight).

  • http://twitter.com/corkran Lee Corkran

    And here I agree, as you have outlined the “distinguishing data” that provides the “context” for the search that relevance and display can respond to in helping the user. And within each of these there are further pivot points that emphasize the relevance further.

  • http://www.zymmetrical.com/ Paul Melcher

    I think the display result of Bing.com is the answer. No more pages, but still limited visuals in one page. No need to worry about screen size anymore.
    Check it out at bing.com under an image search.
    _Paul Melcher_

  • corkran

    Hi Paul, Bing is fresh and thoughtful, to be sure, but conventionally efficient, and far from the answer, in my opinion, if we're really talking about true image search. Its approach to use of screen space seems like “all the news that's print to fit,” rather than taking advantage of ways to make web technologies work for the user. I think its application is too restrictive and limited, almost as if it reinforces their authority in that you don't need to see what they choose not to show.

  • http://www.sambr.com/ sambr

    Taylor makes a great point above about how we need to have good data to start with. We have the images. And we have tools like Tineye to match specific images, but unless I am providing the correct metadata for my images (and ensuring that the metadata is not stripped from the file), search engines cannot find my images. I can have great ways of displaying, but supplying the information that describes the image is a step we must all work harder to implement.

    My beautiful picture of a sunrise over Shanghai, signifying the New China, may be great, but if it doesn't have information linking it to China, few search engines will find it. TinEye is making steps in the direction to solve this.

    We need tools to allow us to fill in the information on our images easily and we need search engines that can successfully use that information to find the images. After that we can argue about displaying 150 images or 8000 images.

  • corkran

    Concur. Good points throughout.

  • http://www.davidsanger.com David Sanger

    Sam – it would be very nice as a start if the search engines (including Tineye) even looked at metadata. None do that I am aware of.

    Google studiously refuses to look inside the file, and Tineye is only concerned with the image itself and does purely visual matches.

    Of course specialized stock photography sites all rely on embedded metadata, albeit with different standards of usage.

  • http://www.davidsanger.com David Sanger

    The continuous scrolling of Bing is not new. Cooliris already provides it for Getty Images, Google, Yahoo, Flickr and other Cooliris-enabled sites.

    Even so, they all limit the number of images you can see. Google Images will only show you 1000 images, Getty and Flickr only 4000, and BrightQube, as noted, only shows 8000.

