@part(approaches, root "text")
@section(Approaches to Naming)@label(approaches)

The nature of a distributed system, in which servers and objects are placed at diverse locations, makes the design of a uniform and efficient name-handling mechanism a difficult problem. One difficulty lies in choosing where name mapping and interpretation are to be done. We identify and contrast two possible solutions below.

@subsection(Two Models)

In one model, a logically centralized @i[name server] provides name mapping as a service. This @i[centralized] model is motivated primarily by the considerations of uniformity and minimal duplication of function mentioned above. Several distributed system designs@Cite(NEED82,OPPEN81,WATS81) have identified @i[naming] as a service in this way and provided a distinguished @i[name server] to perform this service. Ideally, every server, object, and service in such a system is registered with the name server, and clients present the registered names to the name server when referring to these entities.

The second, @i[distributed] model stores the names with the objects themselves. This approach is motivated by considerations of efficiency, reliability, and extensibility. By itself, this approach does not seem to provide a full solution to the naming problem, since it is not clear in general how to find an object given its name if the name is stored with the object.

Many hybrid approaches are possible. For example, a logically centralized name server can be given a distributed implementation, in which definitions of names are placed close (in terms of communication costs) to the named objects. The basic model is the same, but distributed techniques are used to increase efficiency. In contrast, we have chosen to explore a hybrid based on the distributed model, with centralized techniques used only when absolutely necessary to provide the needed functionality.
Some of our reasons for taking this approach are indicated in the comparison below.

@subsection(Comparison)

While the centralized approach has the advantage of localizing the name-handling operations to one server and thus imposing some level of uniformity on the system, there are several advantages to the distributed approach as well.

@b[Efficiency.] Separating the name of an object from its implementation introduces the extra cost of interacting with one more server, the name server, every time a name is referenced. Caching the name in the client would introduce inconsistency problems and only benefit the few applications that reuse names. Because of this cost, few name servers implement file names separately from a file server, even though name servers implement names for many other system entities such as hosts and users.

@b[Consistency.] Separating the naming implementation from the implementation of the named entity makes it more difficult to ensure that the name server's information is kept consistent with the objects being named. For example, deleting a named object requires notifying the name server that its name for the object is invalid. If one of the servers crashes during the operation, the system will be left inconsistent unless deletion is performed as a multi-server atomic transaction. Such solutions to the consistency problem reduce the efficiency of using name servers. Alternatively, many servers and client programs must be prepared to deal with inconsistency in the name service.

@b[Fewer levels of naming.] If objects and their names are kept together, mapping from a name to its associated object is an internal operation for the server that maintains both. A name server, on the other hand, cannot map a name to an entity, but only to another name that can be used directly with the server implementing the entity. Thus, an additional level of naming is required between the name server and other system servers.
A common design is to use low-level globally unique identifiers (e.g., 48-bit values), with the view that such identifiers are efficient to communicate and manipulate. We hold the view that unique identifiers are a generalization of identifiers used internally in, for example, file system implementations. Making such identifiers externally visible, and requiring them to be of a uniform format and globally unique, either imposes a uniform scheme of internal naming on all servers, or forces the unique identifiers to be treated as purely external names which are mapped by each server to the lower-level identifiers it uses internally.

@b[Extensibility.] A distributed system typically includes numerous different @i[established] name spaces, name-handling servers, and interpretations. For example, the names for mailboxes, such as ``cheriton@@su-score.ARPA,'' may be imposed by standards established outside of the system in question. Such preexisting servers fit well into a model in which names are normally interpreted by the server providing the named objects, but are difficult to accommodate in a system using a name server that translates all names into low-level universal identifiers.

@b[Reliability.] If an object's name is stored with the object, the name will always be accessible if the object itself is accessible. A name server, on the other hand, represents a central failure point: its failure can leave objects at locations where there have been no failures inaccessible, simply because they cannot be named.

Although the distributed model offers a number of advantages, it also has some drawbacks. The distributed model works best in the case where a class of objects is implemented by a server, and the objects are stored near the server.
Files provided by a storage server are an excellent example; it is convenient to store file names in directory files on the same storage medium as the files they name, and to implement the naming within the storage server. As another example, servers that provide a small number of transient objects (for instance, virtual terminal servers) can store names and attributes of the objects in memory.

Extending the distributed model to cover the entire space of named entities in a distributed system is more difficult. It is advantageous for each server to provide name mapping for the objects it implements, for the reasons described above, but it is not clear how to define names for the servers themselves. One possibility is to provide one or more name servers to map from server names to addresses or other low-level identifiers for servers. This partially centralized method shares several of the drawbacks of fully centralized naming schemes. Another method would be to have each server store its own name. A name mapping request could then be broadcast or multicast to a group of servers, and each server would compare the specified name with its own name. This technique introduces an additional cost, in that each server in the group receives many requests that are not directed to it, and must spend some processing time in examining and discarding them. There are also potential problems of consistency: some care is required to prevent two distinct servers from storing identical names for themselves.

@subsection(The V Naming Model)

In the V naming model, we have combined aspects of the centralized and distributed models using the technique of @i[distributed name interpretation]. Names may have more than one component (i.e., they may be hierarchical), and different components may be interpreted by different servers.
For example, the first component may be interpreted by a context prefix server, which maps the component to a low-level identifier designating a server, then forwards the remainder of the name on to that server. The context prefix server provides a central repository for the names of servers, while the names of other objects are kept close to the objects themselves. In the case of context prefix servers, we have tried to avoid the drawbacks of centralized naming by providing multiple context prefix servers (one per user), and by experimenting with a variety of mechanisms to obtain initial name definitions to be stored by the prefix servers. Planned future work includes experiments using a newly introduced group send mechanism in the V kernel@cite[CHERI84a] to map server names using the multicast technique mentioned above.

Our approach is also shaped by the recognition that the system is, in part, a distributed database of information on the entities it implements. The name of an entity is just one of its attributes. Extending the name-handling mechanism to include a query operation on objects fits naturally into our model because the server interpreting a name generally also implements the named entity, allowing it to provide additional information about the entity with little difficulty. In contrast, extending a name server to include additional information about the entities it names exacerbates the problems discussed in the previous section, particularly that of maintaining consistency.

Before presenting the details of the name-handling protocol, we describe the basic system environment.
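
As an illustration of the distributed name interpretation described in this section, the following is a minimal sketch in Python. It is not the V implementation. The class names, the method names, the use of "/" as a component separator, and the sample names are all assumptions introduced here for illustration; the text above does not specify a name syntax or a programming interface.

```python
class ObjectServer:
    """Hypothetical server that keeps names together with its objects."""

    def __init__(self, server_id, objects):
        self.server_id = server_id      # stand-in for a low-level server identifier
        self._objects = dict(objects)   # name -> object, stored in one place

    def interpret(self, name):
        # Mapping a name to its object is an internal operation of this server.
        if name not in self._objects:
            raise KeyError(f"{name!r} is not known to server {self.server_id}")
        return self._objects[name]

    def delete(self, name):
        # The name and the object disappear together, so there is no
        # separate name server to notify and nothing to become inconsistent.
        del self._objects[name]


class ContextPrefixServer:
    """Hypothetical prefix server: interprets only the first component."""

    def __init__(self):
        self._prefixes = {}             # prefix -> server implementing the objects

    def define(self, prefix, server):
        self._prefixes[prefix] = server

    def interpret(self, name):
        # Split off the first component (assumed separator: "/").
        prefix, _, remainder = name.partition("/")
        server = self._prefixes[prefix]
        # Forward the remainder of the name to the designated server.
        return server.interpret(remainder)


# Example use with made-up servers and names.
storage = ObjectServer(1, {"paper/naming.mss": "file contents"})
terminals = ObjectServer(2, {"vt0": "terminal state"})

prefixes = ContextPrefixServer()
prefixes.define("storage", storage)
prefixes.define("terminal", terminals)

found = prefixes.interpret("storage/paper/naming.mss")
```

In this sketch the prefix server holds only the names of servers, while each object server interprets the rest of the name against the objects it implements. Deleting an object therefore touches a single server, which reflects the consistency argument made in the comparison above.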