@part(approaches, root "text")
@section(Approaches to Naming)@label(approaches)

The nature of a distributed system, in which servers and objects
are placed at diverse locations,
makes the design of a uniform
and efficient name-handling mechanism a difficult problem.
One difficulty lies in choosing where name mapping and interpretation
are to be done.
We identify and contrast two possible solutions below.

@subsection(Two Models)
In one model,
a logically centralized @i[name server] provides name mapping as
a service.  This @i[centralized]
model is motivated primarily by the considerations of
uniformity and minimal duplication of function mentioned above.
Several distributed system designs@Cite(NEED82,OPPEN81,WATS81)
have identified @i[naming] as a service in this way and provided a
distinguished @i[name server] to perform this service.
Ideally, every server, object, and service in such a system is registered
with the name server, and clients present the registered names
to the name server when referring to these entities.

The second, @i[distributed] model
stores the names with the objects 
themselves.  This approach is motivated by considerations of
efficiency, reliability, and extensibility.  By itself, this
approach does not seem to provide a full solution to the naming problem,
since it is not clear in general how to find an object given its name,
if the name is stored with the object.

Many hybrid approaches are possible.  For example, a logically
centralized name server can be given a distributed implementation, in
which definitions of names are placed close (in terms of communication
costs) to the named objects.  The basic model is the same, but
distributed techniques are used to increase efficiency.
In contrast, we have chosen to explore a hybrid
based on the distributed model, with centralized techniques used
only when absolutely necessary to provide the needed functionality.
Some of our reasons for taking this approach are indicated in the
comparison below.

@subsection(Comparison)

While the centralized approach
has the advantage of localizing the name-handling
operations to one server and thus imposing some level of uniformity
on the system, there are several advantages
to the distributed approach as well.

@b[Efficiency.] Separating the name of an object from its implementation
introduces the extra cost of interacting with one more
server (the name server) every time a name is referenced.
Caching the name in the client would introduce inconsistency problems
and only benefit the few applications that reuse names.
Because of this cost, there are few name servers
that implement file names separate from a file server,
even though the name servers
implement names for many other system entities such as hosts and users.

@b[Consistency.]
Separating the naming implementation
from the implementation of the named entity makes it more difficult to
ensure the name server's information is kept consistent with the
objects being named.
For example, deleting a named object requires notifying the name server
that its name for the object is invalid.
If one of the servers crashes during the operation,
the system will be left inconsistent unless
deletion is performed as a multi-server atomic transaction.
Such solutions to the consistency problem reduce the efficiency
of using name servers.
Alternatively, many servers and client programs must be prepared to deal with
inconsistency in the name service.
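
The deletion example can be illustrated with a minimal sketch.
The classes, the two-step delete, and the crash flag below are all hypothetical,
chosen only to show how a failure between the two updates leaves a dangling name
that clients must then tolerate.

```python
class ObjectServer:
    """Holds the objects themselves, keyed by a low-level identifier."""
    def __init__(self):
        self.objects = {}


class NameServer:
    """A separate server that maps names to low-level identifiers."""
    def __init__(self):
        self.bindings = {}


def delete(name, names, objects, crash_between_steps=False):
    """Non-atomic deletion: two updates on two different servers."""
    ident = names.bindings[name]
    del objects.objects[ident]        # step 1: remove the object
    if crash_between_steps:
        return                        # simulated crash before step 2
    del names.bindings[name]          # step 2: invalidate the name


def lookup(name, names, objects):
    """Clients must cope with a name whose object no longer exists."""
    ident = names.bindings.get(name)
    if ident is None:
        return None                   # name is not bound at all
    return objects.objects.get(ident) # None here means a dangling name
```

Performing both steps as one atomic transaction would remove the dangling binding,
at the efficiency cost noted above.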

@b[Fewer levels of naming.]
If objects and their names are kept together, mapping from a name to
its associated object is an internal operation for the server that
maintains both.  A name server, on the other hand,
cannot map a name to an entity,
but only to another name that can be used directly with the server
implementing the entity.
Thus, an additional level of naming is required between the name server
and other system servers.
A common design is to use low-level globally unique identifiers
(e.g., 48-bit values), with the view that such identifiers are efficient
to communicate and manipulate.

We hold the view that
unique identifiers are a generalization of identifiers used internally
in, for example, file system implementations.
Making such identifiers externally visible, 
and requiring them to be of a uniform format and
globally unique,
either imposes a uniform scheme of internal naming on all servers,
or forces the unique identifiers
to be treated as purely external names
which are mapped by each server to the lower-level identifiers
it uses internally.
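
The difference in the number of mappings can be sketched as follows.
The tables, the file name, the identifier value, and the internal index are
illustrative assumptions only; they are not taken from any particular system.

```python
# Centralized model: the name server yields another name (a unique
# identifier), which the implementing server must map again.
name_server = {"report.txt": ("file-server", 0x0000A1B2C3D4)}  # name -> (server, 48-bit id)
uid_to_internal = {0x0000A1B2C3D4: 17}   # unique id -> internal identifier
blocks = {17: b"contents"}               # internal identifier -> object


def centralized_lookup(name):
    server, uid = name_server[name]      # level 1: name server
    internal = uid_to_internal[uid]      # level 2: external id to internal id
    return blocks[internal]              # level 3: internal id to object


# Distributed model: the server that holds the object also holds its name.
directory = {"report.txt": 17}           # name -> internal identifier


def distributed_lookup(name):
    return blocks[directory[name]]       # purely internal to one server
```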

@b[Extensibility.]
A distributed system typically includes numerous different
@i[established] name spaces, name-handling servers, and interpretations.
For example, the names for mailboxes, such as ``cheriton@@su-score.ARPA,''
may be imposed by standards established outside of the system in question.
Such preexisting servers fit well into a model in which names are
normally interpreted by the server providing the named objects,
but are difficult to accommodate in a system using
a name server that translates all names into low-level universal identifiers.

@b[Reliability.]
If an object's name is stored with the object, the name will always be
accessible if the object itself is accessible.  A name server, on
the other hand, represents a central point of failure: when it fails,
objects at locations that have suffered no failure become inaccessible
simply because they cannot be named.

Although the distributed model offers a number of advantages,
it also has some drawbacks.  The distributed model works best
in the case where a class of objects
is implemented by a server, and the objects are stored near the server.
Files provided by a storage server are an excellent example; it is convenient
to store file names in directory files on the same storage medium as the
files they name, and to implement the naming within the storage server.
As another example,
servers that provide a small number of transient objects (for instance, virtual
terminal servers) can store names and attributes of the objects in memory.

Extending the distributed model to cover the entire space of
named entities in a distributed system is more difficult.
It is advantageous for each server to provide name mapping for the
objects it implements, for the reasons described above, but it is
not clear how to define names for the servers themselves.  One possibility
is to provide one or more name servers to map from server names
to addresses or other low-level identifiers for servers.  
This partially centralized
method shares several of the drawbacks of fully
centralized naming schemes.

Another method would be to have each server store its own name.
A name mapping request could then be broadcast or multicast to
a group of servers, and each server would compare the specified
name with its own name.  This technique introduces an additional
cost, in that each server in the group receives many requests that
are not directed to it, and must spend some processing time in
examining and discarding them.  There are also potential problems of
consistency: some care is required to prevent
two non-identical servers
from storing identical names for themselves.
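
A minimal simulation of this multicast technique is sketched below.
The Server class, its discarded counter, and the integer addresses are hypothetical;
an in-process loop stands in for a real broadcast or multicast.

```python
class Server:
    """A server that stores its own name and answers only for that name."""
    def __init__(self, name, address):
        self.name = name
        self.address = address
        self.discarded = 0            # requests examined and thrown away

    def handle_query(self, wanted):
        if wanted == self.name:
            return self.address
        self.discarded += 1           # cost borne by every non-matching server
        return None


def multicast_lookup(group, wanted):
    """Deliver the request to every server in the group; collect the replies."""
    replies = []
    for server in group:
        address = server.handle_query(wanted)
        if address is not None:
            replies.append(address)
    return replies
```

In this sketch a lookup over a group of n uniquely named servers causes n - 1 of them to
examine and discard the request, and more than one reply indicates that two servers
have stored the same name for themselves.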

@subsection(The V Naming Model)
In the V naming model, we have combined aspects of the centralized
and distributed models using the technique of @i[distributed
name interpretation].  Names may have more than one component (i.e., they
may be hierarchical), and different components may be interpreted by
different servers.  For example, the first component may be interpreted
by a context prefix server, which maps the component to a low-level
identifier designating a server, then forwards the remainder of the
name on to the server.  The context prefix server provides a central
repository for the names of servers, while the names of other objects
are kept close to the objects themselves.
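
Distributed name interpretation can be sketched roughly as follows.
The slash separator, the class names, and the use of a direct object reference in place
of a low-level server identifier are assumptions made for illustration;
they are not the actual V name syntax or protocol.

```python
class FileServer:
    """Keeps the names of its objects together with the objects."""
    def __init__(self, objects):
        self.objects = objects

    def interpret(self, name):
        return self.objects[name]


class ContextPrefixServer:
    """Interprets only the first component, then forwards the remainder."""
    def __init__(self):
        self.prefixes = {}            # first component -> server

    def define(self, prefix, server):
        self.prefixes[prefix] = server

    def interpret(self, name):
        prefix, _, remainder = name.partition("/")
        server = self.prefixes[prefix]       # central repository of server names
        return server.interpret(remainder)   # the rest is interpreted elsewhere
```

For example, after define("storage", FileServer({"notes/todo": "buy milk"})),
interpreting the name storage/notes/todo forwards notes/todo to the file server,
which resolves it against its own directory.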

In the case of context prefix servers, we have tried to avoid the drawbacks of
centralized naming by providing multiple context prefix
servers (one per user), and experimenting with a variety of mechanisms
to obtain initial name definitions to be stored by the prefix servers.
Planned future work includes experiments
using a newly introduced group send mechanism
in the V kernel@cite[CHERI84a] to map server names using the multicast
technique mentioned above.

Our approach is also shaped by the recognition that
the system is, in part, a distributed database of information on the entities
it implements.
The name of an entity is just one of its attributes.
Extending the name-handling mechanism to include a query operation on objects
fits naturally into our model because the server interpreting a name
generally also implements the named entity, allowing it to provide
additional information about the entity with little difficulty.
In contrast, extending a name server to include additional
information about the entities it names exacerbates the problems
discussed in the previous section, particularly that of
maintaining consistency.
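
As a rough sketch of why a query operation fits this model, consider a server that
keeps each name alongside the other attributes of the object.
The StorageServer class and the size and owner attributes are hypothetical.

```python
class StorageServer:
    """Implements the objects and interprets their names in one place."""
    def __init__(self):
        self.files = {}               # name -> attributes of the named object

    def create(self, name, size, owner):
        # The name is stored as just one more attribute of the object.
        self.files[name] = {"name": name, "size": size, "owner": owner}

    def query(self, name):
        """Return a copy of the object's attributes, or None if it is unknown."""
        record = self.files.get(name)
        return dict(record) if record is not None else None

    def delete(self, name):
        # Name and attributes disappear together; there is no second
        # copy on another server to be kept consistent.
        del self.files[name]
```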

Before presenting the details of the name-handling protocol,
we describe the basic system environment.