Interactive Media readings for Oct 3



8 thoughts on “Interactive Media readings for Oct 3”

  1. See Who’s Editing Wikipedia: The article on Virgil Griffith’s project to scan and track the addresses of people and corporations editing their Wikipedia entries was very interesting. It shows the capabilities of “scraping” websites and using their data to fuel new and innovative ideas. It also keeps companies transparent and, to a certain degree, honest.

    Should Web Giants Let Startups…: This article was also a fun one to read, as I had never heard of Listpic before and didn’t know of its founder’s battle with Craigslist. The story shows how an idea meant to be helpful can backfire when it relies on data from another website. It also shows the importance of APIs and how working with companies can actually benefit you, though the amount of data you can take may be reduced. I can understand where Craigslist was coming from with their cease and desist letter, as Ryan Sit did not even contact the company before proceeding to scrape their data. And because Craigslist is an ad-free site and Ryan made money off of ads, he was practically spitting in their face.

    How Google Crawls: I had previously watched a video explaining how Google searches the web, and the article definitely helped to refresh some of this information for me. It’s amazing how many factors (200+) they take into account when indexing websites and deciding which ones to show when you search for specific words. The web really is becoming an open, unified creation that anyone (with a working internet connection) can enjoy.

  2. For the Wikipedia article, this guy created a program that could trace any changes made to any Wikipedia article back to the user who made the changes. He exposes these changes so that experts can make sure they are accurate. This makes Wikipedia a little more reliable, because you would be able to tell what is fact and what is not.
    The Data Wars article discusses the effect scraping has on the web. Scraping data can be useful to customers and users, but companies may not like it so much because they consider the information on their web site to be theirs. They have the say-so as to what to do with it. The counterargument is that even though companies own the web sites, the information belongs to the users. Some companies embrace scraping, while other companies see it as an invasion.
    The Googlebot article discusses how Googlebot works and how your site can come up in their database. It does this through crawling, or looking up new and updated pages. It also indexes pages through their tags and attributes. Finally, it serves results ranked with PageRank.
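
The crawl, index, and serve steps described above can be sketched roughly as follows. This is a toy illustration only, not how Googlebot is implemented: the `PAGES` dictionary is a made-up in-memory "web", and a simple count of inbound links stands in for PageRank, which is a more involved calculation.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": page name -> (page text, outgoing links).
PAGES = {
    "home": ("welcome to the cat site", ["cats", "about"]),
    "cats": ("cat photos and cat facts", ["home"]),
    "about": ("about this site", ["home"]),
    "orphan": ("cat page nobody links to", []),
}

def crawl(start):
    """Crawling: follow links breadth-first from a starting page."""
    seen, queue = {start}, deque([start])
    while queue:
        page = queue.popleft()
        for link in PAGES[page][1]:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen

def build_index(pages):
    """Indexing: map each word to the set of crawled pages containing it."""
    index = defaultdict(set)
    for page in pages:
        for word in PAGES[page][0].split():
            index[word].add(page)
    return index

def inbound_links(pages):
    """Assumed stand-in for PageRank: count links pointing at each page."""
    counts = {page: 0 for page in pages}
    for page in pages:
        for link in PAGES[page][1]:
            if link in counts:
                counts[link] += 1
    return counts

def search(word, index, scores):
    """Serving: look the word up in the index, best-linked pages first."""
    return sorted(index.get(word, ()), key=lambda p: (-scores[p], p))

crawled = crawl("home")
index = build_index(crawled)
scores = inbound_links(crawled)
print(search("cat", index, scores))
```

Note that the search only consults the index built from crawled pages, so the "orphan" page never shows up even though it mentions cats: nothing links to it, so the crawler never finds it.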

  3. Wikipedia is a cultural phenomenon, in my opinion. It is a resource of user-compiled data that allows for checks and balances by anonymous users to ensure validity. This may be highly exploitable, though, as John Borland explains in his article. Virgil Griffith saw the uniqueness of Wikipedia’s content and, since the site stores all of its edit records, made them searchable. He has found activity from the White House to the CIA, showing they take an interest in editing or composing their own content for their own interests. All of the information was already available to the public; it was just a matter of creating a way to navigate through it, which Griffith did through his service. I find it very interesting that Wikipedia is such a large website that a project would be made just to skim information off the top of what flows out of the site. This reminds me of the project Owen did to extract data from (I think)

    The Data Wars article is very effective in explaining what scraping is (automatically harvesting information from another site and using the results) and in giving great examples of it. One example was the website Listpic, which from reading sounds like an unbelievably great idea. I instantly thought, wow, this guy could become rich off this, and sure enough, in a short span of time his site found success in both traffic and advertising. This is indisputable evidence of the power of scraping information from specific websites on the internet. However, Craigslist took this personally, even though the listings were available to be viewed by anyone and thus, I assumed, could be used by anyone. Apparently not, as Craigslist shut down the working parts of his service, destroying what could’ve been a great way to browse Craigslist without the clumsiness of Craigslist.
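
To make "automatically harvesting information from another site" a bit more concrete, here is a minimal sketch using Python's standard `html.parser` module. The markup in `SAMPLE_HTML` is invented for illustration and is not Craigslist's actual page structure; a real scraper would also need to fetch pages over the network and respect the site's terms of use.

```python
from html.parser import HTMLParser

# Hypothetical listing markup; real sites differ and may forbid scraping.
SAMPLE_HTML = """
<ul>
  <li class="listing"><a href="/item/1">Red bicycle</a><img src="/img/bike.jpg"></li>
  <li class="listing"><a href="/item/2">Oak desk</a><img src="/img/desk.jpg"></li>
</ul>
"""

class ListingScraper(HTMLParser):
    """Collects each link's URL and title plus the image that follows it."""

    def __init__(self):
        super().__init__()
        self.listings = []
        self._in_link = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            # A link starts a new listing record.
            self._in_link = True
            self.listings.append(
                {"url": attrs.get("href"), "title": "", "image": None}
            )
        elif tag == "img" and self.listings:
            # Attach the image to the most recent listing.
            self.listings[-1]["image"] = attrs.get("src")

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False

    def handle_data(self, data):
        # Text inside the link is the listing title.
        if self._in_link and self.listings:
            self.listings[-1]["title"] += data.strip()

scraper = ListingScraper()
scraper.feed(SAMPLE_HTML)
for item in scraper.listings:
    print(item["title"], item["image"])
```

Pairing each listing title with its picture is roughly the kind of result the article describes Listpic presenting, though how Listpic actually did it is not covered in the reading.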

    These articles are really inspiring to me. Coming into this class, I only had experience with HTML & CSS. What interests me most about programming is the power that you can exert with lines of code. On the internet, HTML and CSS are like gunpowder. PHP seems to be more like military-grade C4. I look forward to learning more about the language so maybe one day I could scrape a website for my own purposes.

  4. I really enjoyed the article about scraping. I had no idea what scraping or APIs were until I read this. It’s definitely helping the internet become a better place for the common consumer, but for companies, it may be an annoyance.

  5. All three articles proved to be very interesting and informative reads. The idea of the Wikipedia Scanner, and the ability to get to the source of any anonymous edit or change made to a Wikipedia document, gives the website a little more credibility, which makes me feel better. The Data Wars article helped remind me that everyone has opinions and they should be respected. Even though Listpic’s intention was to enhance the Craigslist experience, which it did, Listpic was doing so with the aid of ads, something that Craigslist opposed. Finding out that searching on Google means surfing Google’s index of the web, and not the actual web itself, was kinda mind blowing. I now have a better understanding of what goes on online.

  6. The article about scraping from large internet companies and databases made some good points on the advantages and disadvantages. It is unfortunate that information, though available for us to see and analyze, just isn’t always as easy to utilize for our own projects. Sit, who had gotten in trouble with Craigslist, was simply trying to find a better way to view content. I could see Craigslist, being a well-known site, taking this idea and running with it–appropriating it as they wish.
    The article on Wikipedia gave some interesting insight into the process of keeping data accurate, which is always a concern for an open-forum encyclopedia source.
    As far as Google basics go, this highlighted how Google works, which is very complex, yet so expedient. I also enjoyed the video Jason posted.
