URL decay at year 20: A research note
Fatih Oguz
Department of Library and Information Studies, The University of North Carolina at Greensboro, Greensboro, NC, 27412 USA
Wallace Koehler
Department of Library and Information Studies, Valdosta State University, Valdosta, GA, 31698 USA
Abstract
All text is ephemeral. Some texts are more ephemeral than others. The web has proved to be among the most ephemeral and changeable of information vehicles. This research note revisits Koehler's original data set about 20 years after it was first collected. By late 2013, the number of URLs responding to a query had fallen to 1.6% of the original sample. A query of the 6 remaining URLs in February 2015 showed only 2 still responding.
References
- Baeza-Yates, R.A., & Ribeiro-Neto, B. (1999). Modern information retrieval. Boston: Addison-Wesley Longman Publishing.
- Bar-Ilan, J., & Peritz, B.C. (2009). The lifespan of “informetrics” on the web: An eight year study (1998–2006). Scientometrics, 79(1), 7–25. http://doi.org/10.1007/s11192-009-0401-7
- Cho, J., & Garcia-Molina, H. (2000). The evolution of the web and implications for an incremental crawler. In Proceedings of the 26th International Conference on Very Large Data Bases (pp. 200–209). San Francisco: Morgan Kaufmann Publishers. Retrieved from: http://dl.acm.org/citation.cfm?id=645926.671679
- Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P., & Berners-Lee, T. (1999). Hypertext transfer protocol—HTTP/1.1 (Internet RFC). Retrieved from: http://www.hjp.at/doc/rfc/rfc2616.html
- Goh, D.H.-L., & Ng, P.K. (2007). Link decay in leading information science journals. Journal of the American Society for Information Science and Technology, 58(1), 15–24. http://doi.org/10.1002/asi.20513
- Hennessey, J., & Ge, S.X. (2013). A cross disciplinary study of link decay and the effectiveness of mitigation techniques. BMC Bioinformatics, 14(Suppl. 14), S5. http://doi.org/10.1186/1471-2105-14-S14-S5
- Internet Assigned Numbers Authority (IANA). (2014). Hypertext Transfer Protocol (HTTP) Status Code Registry. Retrieved from: http://www.iana.org/assignments/http-status-codes/http-status-codes.xhtml
- Koehler, W.C. (1997). Web Site And Web Page Persistence And Change: A Longitudinal Study (MS Thesis). University of Tennessee, Knoxville. Retrieved from: http://www.sis.utk.edu/thesis/wallace-conrad-koehler-jr
- Koehler, W.C. (1999a). An analysis of web page and web site constancy and permanence. Journal of the American Society for Information Science, 50(2), 162–180. http://doi.org/10.1002/(SICI)1097-4571(1999)50:2<162::AID-ASI7>3.0.CO;2-B
- Koehler, W.C. (1999b). Classifying Web sites and Web pages: The use of metrics and URL characteristics as markers. Journal of Librarianship and Information Science, 31(1), 21–31. http://doi.org/10.1177/096100069903100103
- Koehler, W.C. (1999c). Digital libraries and World Wide Web sites and page persistence. Information Research, 4(4). Retrieved from: http://www.informationr.net/ir/4-4/paper60.html
- Koehler, W.C. (2002). Web page change and persistence—A four-year longitudinal study. Journal of the American Society for Information Science and Technology, 53(2), 162–171. http://doi.org/10.1002/asi.10018
- Koehler, W.C. (2004). A longitudinal study of web pages continued: A consideration of document persistence. Information Research, 9(2). Retrieved from: http://www.informationr.net/ir/9-2/paper174.html
- Markwell, J., & Brooks, D.W. (2003). “Link rot” limits the usefulness of Web-based educational materials in biochemistry and molecular biology. Biochemistry and Molecular Biology Education: A Bimonthly Publication of the International Union of Biochemistry and Molecular Biology, 31(1), 69–72. http://doi.org/10.1002/bmb.2003.494031010165
- Nelson, M.L., & Allen, B.D. (2002). Object persistence and availability in digital libraries. D-Lib Magazine, 8(1). http://doi.org/10.1045/january2002-nelson
- Oguz, F., & Koehler, W.C. (2011). Document constancy and persistence: A study of web pages in library and information science domain. In Proceedings of the 74th ASIS&T Annual Meeting (Vol. 48, pp. 1–9). New Orleans, LA. Retrieved from: http://www.asis.org/asist2011/proceedings/
- Payne, N., & Thelwall, M. (2007). A longitudinal study of academic webs: Growth and stabilisation. Scientometrics, 71(3), 523–539. http://doi.org/10.1007/s11192-007-1695-y
- Payne, N., & Thelwall, M. (2008a). Do academic link types change over time? Journal of Documentation, 64(5), 707–720.
- Payne, N., & Thelwall, M. (2008b). Longitudinal trends in academic web links. Journal of Information Science, 34(1), 3–14. http://doi.org/10.1177/0165551507079417
- Rhodes, S. (2010). Breaking down link rot: The Chesapeake project legal information archive's examination of URL stability. Law Library Journal, 102(4), 581–597.
- Rumsey, M. (2002). Runaway train: Problems of permanence, accessibility, and stability in the use of web sources in law review citations. Law Library Journal, 94(1), 27–39.
- Russell, E., & Kane, J. (2008). The missing link: Assessing the reliability of internet citations in history journals. Technology and Culture, 49(2), 420–429.
- Strader, C.R., & Hamill, F.D. (2007). Rotten but not forgotten. The Serials Librarian, 53(1–2), 163–177. http://doi.org/10.1300/J123v53n01_13
- Taylor, M.K., & Hudson, D. (2000). “Linkrot” and the usefulness of web site bibliographies. Reference & User Services Quarterly, 39(3), 273–277.
- Tyler, D.C., & McNeil, B. (2003). Librarians and link rot: A comparative analysis with some methodological considerations. Portal: Libraries and the Academy, 3(4), 615–632. http://doi.org/10.1353/pla.2003.0098
- Wagner, C., Gebremichael, M.D., Taylor, M.K., & Soltys, M.J. (2009). Disappearing act: Decay of uniform resource locators in health care management journals. Journal of the Medical Library Association: JMLA, 97(2), 122–130. http://doi.org/10.3163/1536-5050.97.2.009
- Wren, J.D. (2004). 404 not found: The stability and persistence of URLs published in MEDLINE. Bioinformatics (Oxford, UK), 20(5), 668–672. http://doi.org/10.1093/bioinformatics/btg465