Gnutella and Freenet Represent True Technological Innovation
by Andy Oram
05/12/2000
http://www.oreillynet.com/pub/a/network/2000/05/12/magazine/gnutella.html
The computer technologies that have incurred the most condemnation recently -- Napster, Gnutella, and Freenet -- are also the most interesting from a technological standpoint. I'm not saying this to be perverse. I have examined these systems' architecture and protocols, and I find them to be fascinating. Freenet emerged from a bona fide, academically solid research project, and all three sites are worth serious attention from anyone interested in the future of the Internet.
In writing this essay, I want to take the hype and hysteria out of current reports about Gnutella and Freenet so the Internet community can evaluate them on their merits. This is a largely technical article; I address the policy debates directly in a companion article, The Value of Gnutella and Freenet. I will not cover Napster here because its operation has received more press. It's covered in "Napster: Popular Program Raises Devilish Issues" by Erik Nilsson, and frankly, it is less interesting and far-reaching technically than the other two systems.
In essence, Gnutella and Freenet represent a new step in distributed information systems. Each is a system for searching for information; each returns information without telling you where it came from. They are innovative in the areas of distributed information storage, information retrieval, and network architecture. But they differ significantly in both goals and implementation, so I'll examine them separately from this point on.
Gnutella basics
Each piece of Gnutella software is both a server and a client in one, because it supports bidirectional information transfer. The Gnutella developers call the software a "servent," but since that term looks odd I'll stick to "client." You can be a fully functional Gnutella site by installing any of several available clients; lots of different operating systems are supported. Next you have to find a few sites that are willing to communicate with you: some may be friends, while others may be advertised Gnutella sites. People with large computers and high bandwidth will encourage many others to connect to them.
Evil or Just Controversial?:
Open Source software such as Gnutella and Freeware are spreading as quickly as a virus. But are they really so unhealthy? Andy Oram points out the advantages--and disadvantages--of controversial technologies in this week's edition of Platform Independent on Web Review.
You will communicate directly only with the handful of sites you've agreed to contact. Any material of interest to other sites will pass along from one site to another in store-and-forward fashion. Does this sound familiar, all you grizzled, old UUCP and Fidonet users out there? The architecture is essentially the same as those unruly, interconnected systems that succeeded in passing Net News and e-mail around the world for decades before the Internet became popular.
But there are some important differences. Because Gnutella runs over the Internet, you can connect directly with someone who's geographically far away just as easily as with your neighbor. This introduces robustness and makes the system virtually failsafe, as we'll see in a minute.
Second, the protocol for obtaining information over Gnutella is a kind of call-and-response that's more complex than simply pushing news or e-mail. Figure 1 shows the operation of the protocol. Suppose site A asks site B for data matching "MP3." After passing back anything that might be of interest, site B passes the request on to its colleague at site C -- but unlike mail or news, site B keeps a record that site A has made the request. If site C has something matching the request, it gives the information to site B, which remembers that it is meant for site A and passes it through to that site.
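The exchange described above can be sketched in a few lines of Python. This is a toy model, not the Gnutella wire protocol: the `Node` class, its method names, and the file lists are all hypothetical. It also assumes the unique request identifiers, the bounded queue of remembered requests, and the hop limit that the questions below describe.

```python
import uuid
from collections import OrderedDict


class Node:
    """Toy Gnutella-style site; every name here is illustrative."""

    def __init__(self, name, files=(), max_seen=1000):
        self.name = name
        self.files = set(files)
        self.peers = []
        self.max_seen = max_seen
        # request id -> neighbour that handed us the request
        # (None means we originated it ourselves)
        self.seen = OrderedDict()
        self.results = []  # hits delivered to us as the originator

    def connect(self, other):
        self.peers.append(other)
        other.peers.append(self)

    def _remember(self, req_id, sender):
        self.seen[req_id] = sender
        if len(self.seen) > self.max_seen:
            self.seen.popitem(last=False)  # oldest request drops off

    def search(self, text, ttl=4):
        req_id = uuid.uuid4().hex
        self._remember(req_id, None)
        for peer in self.peers:
            peer.on_query(req_id, text, ttl, sender=self)
        return self.results

    def on_query(self, req_id, text, ttl, sender):
        if req_id in self.seen:  # repeat request: drop it quietly
            return
        self._remember(req_id, sender)
        for name in sorted(self.files):
            if text.lower() in name.lower():
                self.on_hit(req_id, name)
        if ttl > 1:  # pass the request on with a smaller time-to-live
            for peer in self.peers:
                if peer is not sender:
                    peer.on_query(req_id, text, ttl - 1, sender=self)

    def on_hit(self, req_id, hit):
        if req_id not in self.seen:
            return  # request already timed out here
        back = self.seen[req_id]
        if back is None:
            self.results.append(hit)  # we asked for this
        else:
            back.on_hit(req_id, hit)  # hand it one hop back


a = Node("A")
b = Node("B", ["intro.mp3"])
c = Node("C", ["song.MP3", "notes.txt"])
a.connect(b)
b.connect(c)
c.connect(a)  # a loop, so the same request reaches some sites twice
hits = a.search("mp3")
```

Site A ends up with the matching names but no record of which site supplied each one, and the loop between the three sites does not cause the request to circulate twice.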
Figure 1. How Gnutella retrieves information
I am tempted to rush on and describe the great significance of this simple system, but I'll pause to answer a few questions for those who are curious.
1. How are requests kept separate?
Each request has a unique number, generated from random numbers or semi-randomly from something unique to the originating site like an Ethernet MAC address. If a request goes through site C on to site D and then to site B, site B can recognize from the identifier that it's been seen already and quietly drop the repeat request. On the other hand, different sites can request the same material and have their requests satisfied because each has a unique identifier. Each site lets requests time out, simply by placing them on a queue of a predetermined size and letting old requests drop off the bottom as new ones are added.
2. What form does the returned data take?
It could be an entire file of music or other requested material, but Gnutella is not limited to shipping around files. The return could just as well be a URL, or anything else that could be of value. Thus, people are likely to use Gnutella for sophisticated searches, ending up with a URL just as they would with a traditional search engine. (More on this exciting possibility later.)
3. What protocol is used?
Gnutella runs over HTTP (a sign of Gnutella's simplicity). A major advantage of using HTTP is that two sites can communicate even if one is behind a typical organization's firewall, assuming that this firewall allows traffic out to standard Web servers on port 80. There is a slight difficulty if a client behind a firewall is asked to serve up a file, but it can get by the firewall by issuing an output command called GIV to port 80 on its correspondent. The only show-stopper comes when a firewall screens out all Web traffic, or when both correspondents are behind typical firewalls.
4. How does the system stop searching?
Like IP packets, each Gnutella request has a time-to-live, which is normally decremented by each site until it reaches zero. A site can also drastically reduce a time-to-live that it decides is ridiculously high. As we will see in a moment, the time-to-live limits the reach of each site, but that can be a benefit as well as a limitation.
5. How is a search string like "MP3" interpreted?
That is the $64,000 question, and leads us to Gnutella's greatest contribution.
The Holy Grail: searching for dynamically generated data
Gnutella is a fairly simple protocol. It defines only how a string is passed from one site to another, not how each site interprets the string. One site might handle the string by simply running fgrep on a bunch of files, while another might insert it into an SQL query, and yet another might assume that it's a set of Japanese words and return rough English equivalents, which the original requester may then use for further searching. This flexibility allows each site to contribute to a distributed search in the most sophisticated way it can. Would it be pompous to suggest that Gnutella could become the medium through which search engines operate in the 21st century?
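To make the point about per-site interpretation concrete, here is a minimal sketch in which three sites receive the same string and each handles it differently. The handler functions, the `tracks` table, and the glossary are invented for illustration; nothing here comes from an actual Gnutella client.

```python
import sqlite3


def substring_site(files):
    """Site that matches the query against file names, fgrep-style."""
    def handle(query):
        return [f for f in files if query.lower() in f.lower()]
    return handle


def sql_site(conn):
    """Site that drops the query into a parameterised SQL LIKE clause."""
    def handle(query):
        rows = conn.execute(
            "SELECT title FROM tracks WHERE title LIKE ? ORDER BY title",
            (f"%{query}%",),
        )
        return [title for (title,) in rows]
    return handle


def glossary_site(glossary):
    """Site that answers with related terms the requester could search next."""
    def handle(query):
        return [term
                for word in query.lower().split()
                for term in glossary.get(word, [])]
    return handle


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (title TEXT)")
conn.executemany("INSERT INTO tracks VALUES (?)",
                 [("Blue Train",), ("Night Train",), ("So What",)])

sites = [
    substring_site(["train-schedule.txt", "readme.md"]),
    sql_site(conn),
    glossary_site({"train": ["railway", "locomotive"]}),
]

# The protocol only carries the string; each site decides what it means.
answers = [site("train") for site in sites]
```

The requester sends one string and gets back three kinds of answer: file names, database rows, and suggested search terms.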
Status of Gnutella
Gnutella was started by a division of America Online called Nullsoft. America Online cut off support when it heard about the project, afraid of its potential use for copyright infringement. But a programmer named Bryan Mayland reverse engineered the protocol and started a new project to develop clients. None of the developers of current software have looked at code from Nullsoft. Gnutella is an open source project with clients registered under the GNU License.
Limitations and risks of Gnutella
Early experiments with Gnutella suggest it is efficient and useful, but has problems scaling. If you send out a request with a time-to-live of 10, for instance, and each site contacts six other sites, up to 6^10, or roughly 60 million, messages could be exchanged in the final hop alone.
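A back-of-the-envelope version of this estimate is easy to compute. The sketch below assumes the worst case, in which every site forwards the request to six sites that have not yet seen it, so no duplicates are dropped. The function name and parameters are illustrative.

```python
def flood_messages(fanout, ttl):
    """Worst-case message counts when every site forwards to `fanout` new sites."""
    per_hop = [fanout ** hop for hop in range(1, ttl + 1)]
    return per_hop[-1], sum(per_hop)


last_hop, total = flood_messages(fanout=6, ttl=10)
print(f"{last_hop:,} messages in the final hop, {total:,} overall")
```

In a real network many of these sites overlap and repeat requests are dropped, so the actual traffic is lower, but the growth is still exponential in the time-to-live.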
The exponential spread of requests opens up the most likely source of disruption: denial-of-service attacks caused by flooding the system with requests. The developers have no solution at present, but suggest that clients keep track of the frequency of requests so that they can recognize bursts and refuse further contact with offending nodes.
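One way a client might keep track of request frequency is a sliding window per neighbor. The following is only a sketch of that suggestion: the `BurstDetector` class, its thresholds, and the choice of a fixed time window are assumptions, not anything the Gnutella developers have specified.

```python
from collections import defaultdict, deque


class BurstDetector:
    """Refuse a neighbor that sends more than max_requests within window seconds."""

    def __init__(self, max_requests=20, window=10.0):
        self.max_requests = max_requests
        self.window = window
        self.history = defaultdict(deque)  # peer -> recent request timestamps
        self.blocked = set()

    def allow(self, peer, now):
        if peer in self.blocked:
            return False
        times = self.history[peer]
        times.append(now)
        # forget timestamps that have fallen out of the window
        while times and now - times[0] > self.window:
            times.popleft()
        if len(times) > self.max_requests:
            self.blocked.add(peer)  # burst detected: refuse further contact
            return False
        return True


detector = BurstDetector(max_requests=3, window=1.0)
noisy = [detector.allow("noisy-peer", t) for t in (0.0, 0.1, 0.2, 0.3, 5.0)]
quiet = [detector.allow("quiet-peer", t) for t in (0.0, 2.0, 4.0, 6.0)]
```

Timestamps are passed in explicitly so the behavior is easy to check; a real client would read the clock instead.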
Furthermore, the time-to-live imposes a horizon on each user. I may repeatedly search a few hundred sites near me, but I will never find files stored a step beyond my horizon. In practice, information may still get around. After all, Europeans in the Middle Ages enjoyed spices from China even though they knew nothing except the vaguest myths about China. All they had to know was some sites in Asia Minor, who traded with sites in Central Asia, who traded with China.
Spencer Kimball, a developer of the Linux client for Gnutella, says this subnetting can serve to protect Gnutella from attack. Gnutella has already suffered service disruptions, mostly because of bugs in clients, and in the future it is certain to be attacked with vicious and sophisticated attempts to bring it down. While some groups of sites have slowed down temporarily or become severed from other groups, the system has never actually come down.
People may misuse Gnutella for other reasons besides denial of service, of course. One site was recently reported to use it for a sting: The site advertised file names that appeared to offer child pornography, then logged the IP address and domain name of every download request. The reason such information was available is that Gnutella uses HTTP; there is no difference between the user information Gnutella offers and that offered by any Web browser.
A final limitation of Gnutella worth mentioning is the difficulty authenticating the source of the data returned. You really have no idea where the data came from -- but that's true of e-mail and news right now too. Clients don't have to choose anonymity; they can identify themselves as strongly as they want. If a Gnutella client chooses to return a URL, that's just as trustworthy as a URL retrieved in any other manner. If a digital signature infrastructure becomes widespread, clients could use that too. I examine reliability and related policy issues in the article The Value of Gnutella and Freenet.
Freenet basics
The goals of Gnutella and Freenet are very different. Those of Freenet are more explicitly socio-political and, to many people, deliciously subversive:
a. To allow people to distribute material anonymously.
b. To allow people to retrieve material anonymously.
c. To make the removal of material almost insuperably difficult.
d. To operate without central control.
The last feature characterizes both Freenet and Gnutella, and differentiates them from Napster. A court order can shut down Napster (and any mirror site), but shutting down Freenet or Gnutella would be just as hard as prosecuting all those 317,000 Internet users who allegedly exchanged Metallica songs.
Another technical goal of Freenet proves particularly interesting: it spreads data randomly among sites, where the data can appear and disappear unpredictably. In addition to serving the social goals listed above, Freenet offers an intriguing possible solution to the problem of Internet congestion, because popular information automatically propagates to many sites.
Freenet bears no relation to the community networks with similar names of the 1980s and early 1990s. It grew out of a research project launched in 1997 by Ian Clarke at the Division of Informatics at the University of Edinburgh. He has made a paper from that project available online. (Warning: It's a PDF and I had trouble both viewing and printing it from a couple different PDF viewers.)
The Freenet architecture and protocol are similar to Gnutella's in many ways. Each cooperating person downloads a client and sends requests to a few other known clients. Requests are uniquely marked, are handed from one site to another, are temporarily stored on a stack so that data can be returned, and are dropped after each one's time-to-live expires.
The game of find-the-data
The main difference between the two systems is that when a Freenet client satisfies a request, it passes the entire data to the requester. This is an option in Gnutella but is not required. Even more important, as the data passes back along a chain of Freenet clients to the original requester, each client keeps a copy (unless it is a huge amount of data and the client decides that keeping it is not worth the disk space). The client keeps the data so long as other people keep asking for it, but discards the data after some period of time when no one seems to want it.
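The copy-on-return behavior can be sketched as a chain of nodes, each with a small store. This is an illustration only: the `CachingNode` class is hypothetical, and the least-recently-requested eviction rule is an assumption standing in for Freenet's "discard when no one seems to want it" policy, which the article does not spell out.

```python
from collections import OrderedDict


class CachingNode:
    """Toy Freenet-style node that keeps a copy of data passing back through it."""

    def __init__(self, name, capacity=2, next_hop=None):
        self.name = name
        self.capacity = capacity
        self.next_hop = next_hop
        self.store = OrderedDict()  # key -> data, least recently requested first

    def put(self, key, data):
        self.store[key] = data
        self.store.move_to_end(key)
        while len(self.store) > self.capacity:
            self.store.popitem(last=False)  # drop what nobody has asked for lately

    def get(self, key):
        if key in self.store:
            self.store.move_to_end(key)  # a request keeps the data alive
            return self.store[key]
        if self.next_hop is None:
            return None
        data = self.next_hop.get(key)
        if data is not None:
            self.put(key, data)  # keep a copy on the way back
        return data


c = CachingNode("C")
c.put("report", b"original bytes")
b = CachingNode("B", next_hop=c)
a = CachingNode("A", next_hop=b)

first = a.get("report")              # travels A -> B -> C, then copies are left behind
copies_after_fetch = ("report" in a.store, "report" in b.store)

a.put("x", b"1")
a.put("y", b"2")                     # A's small store now evicts "report"
evicted_from_a = "report" not in a.store
second = a.get("report")             # still served, this time from B's copy
```

After the first request the data lives on all three nodes, so removing it from site C no longer removes it from the network.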
What is accomplished by this practice, apparently so inefficient compared to the Internet? Ian Clarke tells me it is not all that inefficient -- for large amounts of data its efficiency is comparable to that of the Web -- and that in fact it accomplishes quite a number of things:
a. It allows the transience required to meet the goals of anonymity and persistence.
b. It lets small sites distribute large, popular documents without suffering bandwidth problems. You don't have to run out and get a 16-processor UltraSPARC, or rent space on someone else's, just because you put out an exciting video that lots of people want to download.
c. It rewards popular material and allows unpopular material to disappear quietly. In this regard, Freenet is definitely different from the Eternity Service (a model proposed a few years ago but never implemented). Its goal is not to proliferate any kind of garbage people want, but to prevent material from being taken down if a lot of people think it's valuable.
d. It tends to bring data close to those who want it. (Like Gnutella, "closeness" in Freenet has no geographical meaning, but refers only to the number of hops between Internet sites.) This is because the first request from node A to node B may have to pass through many other nodes, but the second and subsequent requests can be satisfied by node B directly. Furthermore, nodes A and B are likely to be one or two hops away from many nodes operated by similar people who like the same kinds of content. All those people will be pleased to find the content coming back quickly after the first request is satisfied.
The last item is particularly interesting architecturally, because the popularity of each site's material causes the Freenet system to actually alter its topology. When a site discovers that it is getting a lot of material routinely from one of its partners, it tends to favor that partner for future requests. Bandwidth increases where it benefits the end users. Building on my Europe/China Silk Road analogy, Clarke says, "Freenet is like bringing China closer to Europe as more and more Europeans ask to trade with it."
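The idea of favoring a productive partner can be illustrated with a simple success counter. This is a deliberately naive sketch: the `PeerRanking` class is hypothetical, and counting successful replies is an assumption about how "favoring" might be measured, not a description of Freenet's actual routing.

```python
from collections import Counter


class PeerRanking:
    """Order peers so that those who delivered data most often are tried first."""

    def __init__(self, peers):
        self.peers = list(peers)
        self.successes = Counter()

    def record_success(self, peer):
        self.successes[peer] += 1

    def ordered(self):
        # sorted() is stable, so peers with equal counts keep their original order
        return sorted(self.peers, key=lambda p: -self.successes[p])


ranking = PeerRanking(["B", "C", "D"])
for peer in ("C", "D", "C"):
    ranking.record_success(peer)
preferred = ranking.ordered()
```

Over time the ordering drifts toward the partners that actually supply material, which is one way the effective topology can change without anyone reconfiguring it.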
Other unique features of Freenet
Freenet is more restrained in the traffic generated than Gnutella, perhaps because it expects to transfer a complete file of data for each successful request. When a Freenet client receives a request it cannot satisfy, it sends the request on to a single peer; it does not multicast to all peers as Gnutella does. If the client receives a failure notice because no further systems are known down the line, or if the client fails to get a response because the time-to-live timed out, it tries another one of its peers. In brief, searching is done depth-first and not in parallel. Nevertheless, Clarke says searches are reasonably fast; each takes a couple seconds as with a traditional search engine. The simple caching system used in Freenet also seems to produce just as good results as the more deliberate caching used by ISPs for Web pages.
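The depth-first strategy can be sketched as a recursive lookup that tries one peer at a time and backs up on failure. The graph, the stores, and the `fetch` function are all made up for illustration, and the shared `visited` set is a simplification of the request-identifier bookkeeping.

```python
def fetch(node, key, graph, stores, hops_left, visited=None):
    """Depth-first lookup: ask one peer at a time, back up when a branch fails."""
    visited = set() if visited is None else visited
    visited.add(node)
    if key in stores.get(node, {}):
        return stores[node][key], [node]
    if hops_left == 0:
        return None, []  # time-to-live exhausted on this branch
    for peer in graph.get(node, []):
        if peer in visited:
            continue
        data, path = fetch(peer, key, graph, stores, hops_left - 1, visited)
        if data is not None:
            return data, [node] + path
        # failure notice from this peer: fall through and try the next one
    return None, []


graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": [], "E": []}
stores = {"E": {"report": "payload"}}

found, route = fetch("A", "report", graph, stores, hops_left=3)
too_short, _ = fetch("A", "report", graph, stores, hops_left=1)
```

Here site A first follows the B branch to a dead end, then tries C and finds the data at E. Only one request is in flight at any moment, in contrast to Gnutella's fan-out.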
Freenet is being developed in Java and requires the Java Runtime Environment to run. It uses its own port and protocol, rather than running over HTTP as Gnutella does.
Limitations and risks of Freenet
Freenet seems more scalable than Gnutella. One would imagine that it could be impaired by flooding with irrelevant material (writing a script that dumped the contents of your 8-gig disk into it once every hour, for instance) but that kind of attack actually has little impact. So long as nobody asks for material, it doesn't go anywhere.
Furthermore, once someone puts up material, no one can overwrite it with a bogus replacement. Each item is identified by a unique identifier. If a malicious censor tries to put up his own material with the same identifier, the system checks for an existing version and says, "We already have the material!" The only effect is to make the original material stay up longer, because a request for it was made by the would-be censor.
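The refusal to overwrite can be sketched with a store that treats a colliding insert as just another request for the existing item. The `KeyStore` class and its request counter are hypothetical names used only to illustrate the behavior described above.

```python
class KeyStore:
    """Toy store in which an existing identifier can never be overwritten."""

    def __init__(self):
        self._data = {}
        self.requests = {}  # identifier -> how often it has been asked for

    def insert(self, key, data):
        if key in self._data:
            # "We already have the material!" The attempt only counts
            # as one more request for the original.
            self.requests[key] += 1
            return False
        self._data[key] = data
        self.requests[key] = 0
        return True

    def get(self, key):
        if key in self._data:
            self.requests[key] += 1
            return self._data[key]
        return None


store = KeyStore()
accepted = store.insert("report", "original text")
overwritten = store.insert("report", "bogus replacement")
current = store.get("report")
```

The second insert fails, the original text is still what comes back, and the would-be censor has nudged the item's popularity upward.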
The searching problem
The unique identifier is Freenet's current weak point. Although someone posting material can assign any string as an identifier, Freenet chooses for security reasons to hash the string. Two search strings that differ by a single character (like "HumanRights" and "Human-Rights") will hash to very different values, as with any hashing algorithm. This means that a prosecuting agency that is trying to locate offending material will have great difficulty identifying the material from a casual scan of each site.
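The effect of hashing on near-identical strings is easy to see with any cryptographic hash. The sketch below uses SHA-1 from Python's standard library purely as a stand-in; which hash Freenet applies, and how, is an assumption here, and `key_for` is an illustrative name.

```python
import hashlib


def key_for(name):
    """Hash a human-readable identifier into an opaque hex key."""
    return hashlib.sha1(name.encode("utf-8")).hexdigest()


first = key_for("HumanRights")
second = key_for("Human-Rights")
differing = sum(x != y for x, y in zip(first, second))

print(first)
print(second)
print(f"{differing} of {len(first)} hex digits differ")
```

Because the two keys share no visible structure, a site operator or investigator looking at stored keys cannot tell that the two items are related, and a searcher who gets the spelling slightly wrong finds nothing.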
But the hashing also renders Freenet unusable for random searches. If you know exactly what you're looking for on Freenet -- because someone has used another channel to say, for instance, "Search for HumanRights on Freenet" -- you can achieve success. But you can't do free text searches.
One intriguing use of Freenet is to offer sites with hyperlinks. Take people interested in bird-watching as an example. Just as an avid aviarist can offer a Web page with links to all kinds of Web sites, she can offer links that generate Freenet requests using known strings that retrieve data about birds over Freenet. Already, for people who want to try out Freenet without installing the client, a gateway to the Web exists under the name fproxy.
Another area for research is a client that accepts a string and changes it slightly in the hope of producing a more accurate string, then passes it on. The most important task in the Freenet project currently, according to Clarke, is to resolve the search problem.
Letting go
Once again, I refer readers to The Value of Gnutella and Freenet for a discussion of these systems' policy and social implications. I'll end this technical article by suggesting that Gnutella and Freenet continue to loosen the virtual from the physical, a theme that characterizes network evolution. DNS decoupled names from physical systems; URNs will allow users to retrieve documents without domain names; virtual hosting and replicated servers change the one-to-one relationship of names to systems. Perhaps it is time for another major conceptual leap, where we let go of the notion of location. Welcome to the Heisenberg Principle, as applied to the Internet. Information just became free.
Gnutella and Freenet, in different ways, make the location of documents irrelevant; the search string becomes the location. To achieve this goal, they add a new layer of routing on top of the familiar routing done at the IP level. The new layer may appear at first to introduce numerous problems in efficiency and scaling, but in practice these turn out to be negligible or at least tolerable. I think readers should take a close look at these systems; even if Gnutella and Freenet themselves turn out not to be good enough solutions for a new Internet era, they'll teach us some lessons when it's time for yet another leap.
Andy Oram is an editor at O'Reilly & Associates specializing in books on Linux and programming. Most recently, he edited Peer-to-Peer: Harnessing the Power of Disruptive Technologies.
http://chinafreenet.50megs.com