PHONEBOOK SEARCH ENGINE FOR MOBILE P2P SOCIAL NETWORKS
Balázs Bakos, Lóránt Farkas and Jukka K. Nurminen
Nokia Research Center, P.O. Box 392, H-1461 BUDAPEST, Hungary
{balazs.bakos, lorant.farkas}@nokia.com
Nokia Research Center, P.O. Box 407, FIN-00045 NOKIA GROUP, Finland
jukka.k.nurminen@nokia.com
ABSTRACT

Search engines generally lack the trust and personalization dimension needed for recommender systems to answer questions like: “I need a reliable plumber close to my house”. In order to obtain personalization, social relevance, and a decent amount of privacy, the search database itself needs to be personalized. One possible dimension of personalization is the social neighborhood of the searcher.

One projection of a person’s social neighborhood is the set of phonebook entries in a mobile phone. In particular, the phonebook links represent a readily available infrastructure to create a peer-to-peer social network for socially relevant search.

In order to demonstrate the concept our work introduces a novel search engine algorithm for this kind of social network. The concept has been implemented on Series 60 Symbian platforms over GSM using short message exchange. We also discuss privacy aspects and possible enhancements in the social dimension of the search.

KEY WORDS
peer-to-peer (P2P), time-to-live (TTL), search, ranking, smart phone

1. Introduction

Efficient information retrieval is increasingly important both for our professional and personal life. A clear indication of this is the success of search systems such as Google. While they are efficient tools to find product and company information worldwide, they are not very useful in searching for small businesses in your neighbourhood, e.g. reliable plumbers or cozy, small restaurants.

There are several reasons for the shortcomings. First, many small businesses or craftsmen do not have web pages. Secondly, in choosing such a service the recommendations of previous customers are often more important than the actual products of the company. There are no big differences between the services offered by different plumbers, but the quality of the work can vary a lot. Thirdly, the need for such services often arises in an ex tempore fashion.

In this paper we investigate the simple question “I need a reliable plumber close to my house”. As the word “reliable” leaves a lot of room for interpretation, we consider the plumber reliable if your friends or acquaintances have used his services and have been satisfied. We are thus using the social network of a person as a source of recommendations.

Using mobile phones to solve this problem is promising. Most small businesses or craftsmen use mobile phones anyhow, the increasing capabilities of mobile phones make such search applications possible, and for the person searching for the service the mobile phone is usually available, even in urgent situations. Most importantly, however, mobile phones already have the social network defined.

The address book of the phone, and the phone numbers that it contains, link the phone users to each other, forming a highly interconnected network of mobile phones. The network has a very high average node degree, providing a very high potential to become the underlying networking layer for various applications. This is the approach we followed in the present work.

The rest of the paper is organized as follows: Section 2 presents an overview of the prior art. Section 3 analyses the role of the phone contacts in the application. Section 4 presents our solution. Section 5 discusses the solution from several aspects. Section 6 discusses possible future solutions, including some privacy and confidentiality aspects and centralized solutions that would reduce the traffic, the tradeoff being reduced privacy. Finally, Section 7 concludes our paper.

2. Previous Work

The worldwide web with its generic or more specific keyword search engines partially implements the same functionality as our approach. Searching for a good professional in a given area can be as simple as entering an appropriate combination of keywords in Google [1] or Yahoo [3], e.g. “plumber New York”. The outcomes will be more or less reliable and related to the plumber profession in the area of New York. Both Google and Yahoo additionally offer a yellow pages service
containing names, addresses and phone numbers of persons claiming to be plumbers, along with maps where they can be reached. The outcomes for the word “plumber” are actually quite spectacular.

However, if we turn now to other jobs/hobbies, like “fisher”, there will be a huge difference. It turns out that there are a large number of persons called Fisher. The system does not differentiate very well between names and professions: entering the word “fisher” in the yellow pages will return many Fishers having various professions (e.g. attorney, doctor etc.). In addition, yellow pages need to be updated in order to reflect recent changes; we would like this to be done automatically, which is not entirely possible in these systems.

The search for places is also straightforward: yellow pages help us here, too. Entering the combination “restaurant New York” will return dozens of restaurants of various kinds in the area of New York. However, this lacks the quality dimension: we would probably also like to get a recommendation, more or less elaborate, perhaps something similar to what somebody could read in a Lonely Planet book: good cooking, low budget, absolutely to be tried.

[2] is a similar service for the UK. In addition to finding people, business search is also possible. It is also possible to navigate in a list of “most wanted” businesses. The search results are much more detailed than in the Google case.

[1], [2] and [3] all lack the recommendation dimension: it is not possible to rank the person or the service. In addition, a possible third dimension is also missing: it is not possible to obtain hits in the social neighborhood of the requestor. To be more specific, this relates to the ranking dimension in the sense that somebody would probably rely more on the opinion of a second person if this person were his friend or the friend of a friend, in other words, a socially relevant person.

Previous work in the area of social proximity is manifold. Applications ranging from intelligent learning systems based on user contact logging and analysis to systems adapting themselves to the mood and generic context of the user all deal with various dimensions of social relevance. However, relatively few applications use a mapping of social relevance to a certain goodness of transferred data or content. [4] and [5] treat reputation and trust models in peer-to-peer networks, to some extent related to our application. Reputation and trust of content sharers are evaluated and propagated based on a number of parameters: bandwidth, quality of content, variety of content, type of content, online/offline time ratio etc. This is fairly straightforward in content sharing. However, when a person or a place is ranked, the possible set of parameters depends on the query itself: a professional should be rated on his professional quality rather than on his general characteristics. Therefore a generic set of parameters cannot be formulated in the phonebook search application for all possible cases.

Finally, [6], [7] and [8] are examples of centralized databases of communities of various kinds: [6] and [7] are related to professions and jobs, while [8] is more related to personal networks of friends and acquaintances. These systems are focused on specific search areas and are centralized: they all presume that the user data is present in a central location and that the entered data closely matches the user himself and his social proximity. This has several drawbacks. Users must explicitly update and maintain their contacts in the system. Moreover, the scalability, security, and failure recovery problems of centralized systems apply.

3. Phonebook: Infrastructure versus Distributed Database

The phonebook of a smart phone contains a large amount of relevant personal data. We can enter here data related to availability such as address, mobile and landline number, fax, beeper, e-mail, URL. In addition to that we can also enter personal data such as the person’s birthday. Furthermore, the phone contacts database lets us define programmatically as many fields as we like (for details see e.g. the Symbian Contacts Model [10]).

Unfortunately the very appealing idea of using the phonebook not only as an infrastructure, but also as a distributed database, has some limitations. To illustrate, we compare in Table 1 certain aspects of a brief data mining of three sample phonebooks on smart phones in our attempt to find a person whose profession is “painter”:

Table 1. Phone contact statistics

                                                     Coll. 1   Coll. 2   Coll. 3
“Painter” present                                    x         -         -
Has painter contacts                                 x         x         x
Percentage of contacts with reference to job/work    3%        1%        1.5%
“Local” phone numbers                                10%       0.5%      12%

As shown in Table 1, the phonebook, although fit for the purpose of a highly connected social network, doesn’t contain a large amount of useful information. For example, if somebody tried to look for meaningful information about persons (professionals: a painter, persons having a given hobby etc.) or places (restaurants, shops etc.), he would seldom find keywords in the phonebook indicating that a given contact is a painter or that a given name is a restaurant’s. The following problems were detected:

1. All three analyzed phonebooks contained at least one contact who is a painter, but the phonebook contained the word “painter” only in 2 cases out of 3. One possible explanation is that people generally store contacts under their names if they know them well enough, and keep their jobs in mind instead of also keeping the job in the contact list.

2. More generally speaking, usually only a small percentage of contacts have their job field filled in the phonebook. In addition, sometimes the profession is
stored under the name, so a full keyword search in all the text fields is needed in order to find the relevant contacts. In addition, in most languages some family names are in fact names of professions, possibly leading to false hits.

3. Phone numbers are not always shown correctly by some GSM networks. If the GSM number is local, some networks would display it in its local form, e.g. 0620… instead of +3620…. If the user stores the number from the phone logs, the number will not be usable from abroad. Also in this case a phonebook crawler would not know the fully qualified phone number. In addition, the user might fill the wrong field, e.g. the mobile number instead of the landline. In this case a short message would not even make sense. In fact, mobile numbers might be stored in landline fields and vice-versa; the label of the field does not guarantee the type of phone number, only the fact that the stored number has an adequate phone number format.

4. There is always the problem of users having different mother tongues. People generally use their mother tongue in their mobiles. This makes it very hard to find relevant data unless some kind of translation plugin is employed. This is especially true in multinational environments where people of different nationalities get socially close.

From these statements it is obvious that the phonebooks can be regarded today more as an appropriate infrastructure than as a universal container of user-related significant data. Therefore we follow this second approach of using the phonebook only as a link container, not as a source of user-related details.

4. Our Solution

The phonebook is an “always on” resource: on the one hand, our contacts do not change their mobile number very often. On the other hand, the mobile phone of our friend might be momentarily switched off; however, when he switches it on, the query will be serviced. Phonebook search is the materialization of our attempt to implement a search engine on this infrastructure. In our implementation we have used short messaging (SMS) to transfer messages between phones. Obviously it is not optimal for this use case, but it is widely available and supported in almost all phone models. In Sections 5 and 6 the various aspects of messaging technologies will be covered in more detail.

In its most basic form the application provides keyword search. If a user simply wants to find a professional, it is as simple as entering the keyword in the query screen and pressing a button; results will then arrive on the screen of his phone. For more advanced use it is relevant that the search can be executed in two distinct phases. In the first phase the keyword search is executed; later, in the second phase, the user can request a recommendation of the hits, utilizing the social dimension of the search:

1. The name/availability of the persons having the persons matching the initial query in their phonebook. It is thus possible to contact those persons and discuss their experiences.
2. The ranks, given by these persons, as a first approximation of the quality of the person in the given area.

One of the key aspects for an efficient search is an adequately built profile. In previous sections we have already shown that simple keyword search can be problematic if there are similar items in other areas, as was the case with the word “fisher”. In our phonebook crawler the profile consists of three different types of data:

1. personal data
2. professional information
3. interests.

A sample profile and phonebook are shown in Table 2.

Table 2: Own profile and phonebook

Profile
Own phone        Name       E-mail         URL       Profession   Interests
36 4 45 5 6577   John Doe   John@doe.com   Doe.com   Trainer      tennis

Contacts
Phone            Group
36 4 45 5 6577   Search
36 2 576 7 6     Search
36 6 7467 6 47   Friends

As shown in Table 2, the profile currently consists of a predefined basic set of fields that can be queried. In future scenarios this could be extended: some fields could take a value from a predefined set (e.g. marital status), others could be numerical (e.g. phone number, age) and yet others could be strings or sets of strings (e.g. personal interests, professional experience) in which the user is expected to enter as many keywords as he likes.

It is important to notice that typically the person has very few entries in the profile table (typically just a single entry). This is the only data that the user has to explicitly update. The contacts table corresponds to the standard address book data of the phone (which the user is likely to maintain anyhow). As the contacts table is only used to forward the queries to the linked persons, the attribute values, e.g. the name of the contact, do not have any effect on the search.

The person will be found based on the profile, using standard “SELECT * FROM profile WHERE …” type of SQL queries. The more fields of the profile are filled, the more likely he will be found during a search operation and the more areas he will be related to. For instance, the user can enter a large number of words in the field “Interests” and will be found accordingly by queries regarding each word. It is in his interest to do so.

It is to be noted that, as shown at the end of Section 3, the phone stores only the user profile of the owner. So it is currently stored in a simple SQL database having only one record, that of the phone’s user. Incoming queries are simply matched with this record.
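The single-record profile match described above can be illustrated with a minimal sketch. It is not the Symbian implementation: it uses Python's `sqlite3` module, the column names are hypothetical ones modeled on the fields of Table 2, and the case-insensitive substring match and the rejection of empty queries are assumptions, since the paper does not specify the query operators.

```python
import sqlite3

# Hypothetical column names modeled on the fields of Table 2.
PROFILE_COLUMNS = ("phone", "name", "email", "url", "profession", "interests")


def create_profile_db(profile):
    """Create an in-memory database holding the single profile record of the phone owner."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE profile (%s)" % ", ".join(PROFILE_COLUMNS))
    conn.execute(
        "INSERT INTO profile VALUES (?, ?, ?, ?, ?, ?)",
        [profile.get(col, "") for col in PROFILE_COLUMNS],
    )
    return conn


def match_query(conn, criteria):
    """Return the owner's profile as a dict if every criterion matches, else None.

    criteria maps a column name to a keyword. Matching is a substring test
    with SQL LIKE (an assumption; the paper does not name the operator).
    """
    if not criteria:
        return None  # assumption: an empty query is refused rather than matching everything
    for col in criteria:
        if col not in PROFILE_COLUMNS:
            raise ValueError("unknown profile field: %s" % col)
    where = " AND ".join("%s LIKE ?" % col for col in criteria)
    params = ["%" + keyword + "%" for keyword in criteria.values()]
    row = conn.execute("SELECT * FROM profile WHERE " + where, params).fetchone()
    return dict(zip(PROFILE_COLUMNS, row)) if row else None
```

For example, with a profile created by `create_profile_db({"name": "John Doe", "profession": "Trainer", "interests": "tennis"})`, the call `match_query(conn, {"profession": "trainer"})` returns the profile, while `match_query(conn, {"profession": "plumber"})` returns `None`. Column names are checked against a fixed list and keywords are passed as bound parameters, so an incoming query cannot inject SQL.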
The search mechanism is shown in Fig. 1 in more detail. An example of a query could be: search for a person whose job is plumber and whose address contains the string Budapest. The user enters the desired query in his phone (step 1). In the next phase (step 2) the phone sends the query with the following parameters:

• to the predefined contacts (in the settings it could be set to “everybody”, “predefined group”, and “nobody”)
• with a preset time to live (TTL) field (this can be set in the settings), having as effect a larger or smaller propagation horizon in the social neighborhood

[Figure: Phone1 to Phone5, Bill’s phone (plumber) and Bob’s phone (plumber). Step labels: 1. Search: job = plumber; 2. Search sent to all/some persons in the phonebook (depending on search neighborhood); 3. Searches its own profile for matches; 4. Forwards to own contacts; 5. Returns matches; 6. Matches shown anonymously: Bill plumber (1 hit), Bob plumber (2 hit).]
Fig. 1. Search mechanism

The mobiles receiving the query check for a match with their profile (step 3). If there is a match, a query hit message is returned to the originator (step 5). In all cases, if the TTL is not expired (value = 1), it is decremented and the query is forwarded to the contacts of this phone (step 4). The query will reach all the phones within the range TTL of contacts of the query initiator and all query hits will be returned to the originator.

It is important to note that the returned query hit (step 5) does not reveal who the persons were who had the person matching the query as a contact, so the privacy of these people is not violated. It is also emphasized that this works automatically: no user intervention is required (on the other hand the user might set the application to reply and forward only if he explicitly chooses to reply and/or forward).

The replies are returned to the phone generating the query and the user is alerted about the incoming reply messages (step 6). In addition to that, not shown in Fig. 1, the reply messages are stored in a list of replies that can always be examined, as shown in Fig. 2a.

Fig. 2. a) List of replies; b) Reply in detail

The list of replies does not show all the details of the reply message, only the phone number and the name of the person representing a match of the query. Depending on how much information this person had disclosed in step 6, the list items can be further expanded to show all the received details of the match. It is a matter of settings how many of these details a reply message contains; this can depend on whether the asking person is …

As already stated, the usage can stop here. When the user wants to get more precise information/social relevance, he may continue with a rank request. This may have two distinct purposes: (i) to find out the people who had a link to the found person and (ii) to learn their evaluation of him with respect to the given query. In this case the people who have the found match in their phonebook will be alerted and prompted whether they would like to (i) reveal their identity to the requestor and (ii) rank the person. If the user is not willing to respond then “no” answers are assumed. The ranking mechanism is shown in Fig. 3.

[Figure: Phone1 to Phone5, Bill’s phone and Bob’s phone. Step labels: 1. Rank request: <Bob’s phoneno>; 2. Rank request forward; 3. Display rank request; 4. Rank = 4 from <050-234678>, Rank = 5 from <051-987654>, Rank = 5 from <052-135792>; 5. Ranks shown. Prompt on a receiving phone: “Phone 1 is asking info about Bob the painter? Reveal your identity? Rank Bob?” (can be turned always off by a preference selection).]
Fig. 3. Ranking mechanism

Fig. 4 shows two example screens when the user checks for the still unsolved rank requests (marked with a star).

Fig. 4. a) List of received rank requests; b) Detailed

The rank request is sent by the phone (step 1) after selecting the desired hit in the query hit list and pressing “Rate”. The rank request follows the same path and TTL patterns as the query itself. The mobiles in the path receive the rank request and alert the user of the rank request to be solved (step 3), but this only happens if the person to be ranked is found among the phone’s own contacts (e.g. phone 4 will display the rank request, but phone 2 won’t, since Bob is not his contact). This will be displayed in the same manner as the new SMS alert, using a sound alert and a modal dialog on the phone screen.
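The TTL-limited forwarding that both queries and rank requests follow can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name, the dictionary-based phonebooks and the profile strings are hypothetical, the TTL is interpreted as the number of hops a message may travel, and no duplicate suppression is modeled because the paper does not describe one, so the message count is an upper bound.

```python
from collections import deque


def ttl_search(phonebooks, profiles, origin, keyword, ttl):
    """Simulate a TTL-limited keyword query flooding over phonebook links.

    phonebooks: dict mapping a phone to the list of phones in its phonebook.
    profiles:   dict mapping a phone to its owner's profile text.
    Returns (hits, messages): the set of matching phones and the number of
    query messages sent (hit replies are not counted).
    """
    hits = set()
    messages = 0
    # Each queue entry is (receiving phone, remaining TTL on arrival).
    queue = deque((contact, ttl) for contact in phonebooks.get(origin, []))
    while queue:
        phone, remaining = queue.popleft()
        if remaining <= 0:
            continue  # TTL expired: this message is never sent
        messages += 1
        # Step 3: the receiving phone checks its own profile.
        if keyword.lower() in profiles.get(phone, "").lower():
            hits.add(phone)  # step 5: a hit is returned to the originator
        # Step 4: decrement the TTL and forward to the phone's own contacts.
        for contact in phonebooks.get(phone, []):
            queue.append((contact, remaining - 1))
    return hits, messages
```

With `phonebooks = {"A": ["B", "C"], "B": ["D"], "C": [], "D": ["A"]}` and `profiles = {"B": "teacher", "C": "plumber Budapest", "D": "plumber"}`, a search for "plumber" from A with TTL 1 reaches only B and C, while TTL 2 also reaches D through B. The number of messages grows with every additional hop, which is why the TTL and the search group are user settings.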
In all cases, if the TTL has not expired, the rank request is forwarded (step 2). Once the user checks the list of rank requests (new requests marked with a star, as shown in Fig. 4) and rates, the rate solve message will propagate to the querier (step 4) and it will be stored in the query hit in the “Rates” field, shown in Fig. 5.

Fig. 5. Contacts without and with received ranks

Additionally the application can also display the contact details of the persons who gave the ranks, so ultimately the user can call them and ask them personally what they think about the person matching the query.

5. Discussion

5.1. SIM and Costs

A mobile phonebook contains on average 50 to 100 entries (the number is not statistically correct: 10 samples were taken, from colleagues and persons involved in academic research). In most cases the entries consist of names and phone numbers. In earlier mobile phone models these were stored on the SIM, which didn’t allow for entering other meaningful data about contacts.

In future scenarios technologies other than SMS will probably be employed. However, for the time being, it is recommended to use the “Selected group” setting instead of “Send to everybody”, since the cost of a short message is still not negligible today: multiplying it by the number of contacts, the cost of one search for the querier can be estimated as 50 to 100 times the cost of an SMS. Additionally, the same applies to forwarded queries: even if somebody doesn’t query at all but forwards queries of others towards his own contacts, the costs generated for a user by his contacts using the application can become excessively high.

5.2. Profile Exchange

As shown in Section 3, smart phones offer more enhanced possibilities: it is possible today to store a number of entries, among which numbers, addresses, some professional information and e-mail addresses. Sometimes it is also possible to use a field for additional comments, where text can be entered about a given entry. However, as shown in Section 3, the profile contained in the phonebook itself is not detailed enough: the users simply do not enter enough details about their contacts to build a full-featured distributed database. Two alternative solutions can be proposed, as follows.

One alternative is an incorporated profile exchange mechanism that would make it feasible to store the profile of a given contact in an expanded form in the phonebook or in a mirror database. The profile exchange could be triggered by certain events, for instance a call initiated towards this person or a short message sent to him. The advantage of this solution is that it can save one step in the path of the query: it is enough for the query to reach a person having the profile of the match, obtained e.g. through previous profile exchange. The main drawback is the need for the additional step of profile exchange and the need for an extended phonebook or mirror database, which means additional storage capacity. A second drawback is the possibility of malicious contact data mining: as a reply to an empty query the replier would return each of his contacts. An application could protect the user from this by setting a lower limit on the number of significant characters in a query, or through some other method, but in any case it is a problem that has to be handled additionally.

The second alternative is the one presented in Section 4, in which the user stores only his own profile, e.g. in the form of a plain text file. In this case the phone contacts are used only as the networking layer. Hits are returned to the search originator if the query matches this profile, stored in the text file. The advantage of this solution is that it doesn’t require an extended phonebook or additional database mirror, along with the maintenance and updating tasks generated by the profile exchange events. The drawback is that it requires one more step: the contact of a person whose profile matches the query does not know whether the match will take place, so an additional hop from the contact to the person with the profile match is needed. Weighing the advantages and drawbacks of the two solutions, we found the second solution to be better and therefore incorporated it in our application.

5.3. Speed

In case of TTL = 1, the results would mostly arrive within 10 seconds. This is the average round trip time measured in the networks of local operators. When the TTL is increased, further results will also arrive later. The exact time intervals also depend on the operator policy with respect to minimal time intervals accepted from a different operator towards its own subscriber. In the extreme case of switched-off mobiles some results could also arrive hours later. The essential point is probably that in the case of basic short message exchange one would receive query hits within some seconds or at most tens of seconds, provided there are persons matching the query.

5.4. Startup Ramp and Usage Patterns

The application deployment starts to be useful if there are enough users with their application running. If the application does not run in a phone, the received messages would appear as normal short messages.
It is also to be noted that a person not interested in revealing his own details has not much use in running the application. It actually has drawbacks since he deliberately forwards the short messages on behalf of other users, generating unsolicited costs for himself. Further, the “free-rider” problem is also applicable: there could be users who search frequently but otherwise keep their application closed, and so do not collaborate in the forwarding. The application tries to avoid that to a certain extent by using one common value of the TTL and the search group for both own queries and forwarded queries. It would then be impractical for a “free-rider” to keep on switching the TTL and the search group.

6. Future Work

6.1. Privacy

Our solution solves most of the privacy concerns typical of this kind of application. One exception is described in the following. If the messages are sent via SMS as we implemented it in our application, each user can know the message originator and the content. This could lead to problems in cases when the searcher is not willing to disclose the subject of his query to anybody except the person matching the query. Possible use categories could include trade secrets, attorney-client privilege or medical non-disclosure agreements. A possible solution could be based on public key cryptography. If user A sends a query encrypted with his public key, the other users encrypt their own profile fields using this key but will not be able to actually decipher the query. The hit message is returned the same way as in the non-encrypted case.

6.2. Other Messaging Technologies

In the current generation of cellular networks an alternative to SMS could be our earlier work [11]. Cellphones connected via GPRS in various fashions would be a cheaper alternative. However, advanced network maintenance would be necessary because of the frequent disconnections met in GPRS, see [11] for details.

In the next generation cellular networks presence information is stored and accessible by phones through IMS/SIP technologies, this being probably a cheaper solution and an enhanced version of the phonebook network. This would work as follows: the phonebook is extended to contain the SIP address of the contacts in addition to ordinary phone numbers. Having that, UDP packets can be sent between the phones or alternatively, TCP connections can also be established between them. The messages would then be carried by IP instead of SMS, a much cheaper alternative at least in today’s scenario. But the whole concept of the application would remain unchanged.

6.3. Centralized solution

A more appealing alternative would be a centralized solution in which a central database would be used to store the messages or alternatively the phone link networks of individual users. This however raises even more the need for security, privacy and authentication.

It has to be emphasized that this “centralized” solution is different from the centralized solutions described in Section 2. Here the messages and/or the links between mobiles would also be stored centrally. The search could then be executed in this centralized database and the number of message exchanges would be largely decreased. The distributed version could also function as a backup solution if the central database fails.

In the centralized case we have to decide exactly what will be stored in the database. Two options exist:

1. Store the phone numbers, profiles and phone link networks
2. Store only the query and rank request messages for the individual case.

Both solutions have potential advantages and drawbacks but are worth considering as enhancements.

7. Conclusions

In our work we present a phonebook crawler application that returns socially significant hits to queries about persons and places. The novelty is the communication infrastructure, which is the contact list. In order to obtain useful hits the users need to enter their own profiles. We put the idea into practice by implementing it on the Nokia 6600 platform and testing it over a couple of usage scenarios. Possible enhancements are suggested from the viewpoint of privacy and eventual partial or full centralization of the database of profiles.

References:

[1] http://www.google.com/help/features.html#wp, Google phonebook
[2] http://192.com, UK directory enquiry system
[3] http://people.yahoo.com, Yahoo! People search
[4] Y. Wang, J. Vassileva, Trust and reputation model in peer-to-peer networks, Proc. 3rd IEEE Int. Conf. On Peer-to-Peer Computing, Linköping, Sweden, 2003
[5] S. Marti, H. Garcia-Molina, Identity crisis: anonymity vs. reputation in P2P systems, Proc. 3rd IEEE Int. Conf. On Peer-to-Peer Computing, Linköping, Sweden, 2003
[6] US Patent Application 2003/45050, System and method for the provision of socially relevant recommendations
[7] Spoke, www.spoke.com
[8] LinkedIn, www.linkedin.com
[9] Friendster, www.friendster.com
[10] The Symbian Contacts model, http://www.symbian.com/developer/techlib/v70docs/sdl_v7.0/doc_source/reference/cpp/contactsmodel/index.html
[11] B. Bakos et al., Peer-to-peer Content Sharing in Wireless Networks, Proc. 15th Int. Symposium on Personal, Indoor and Mobile Communications, Barcelona, Spain, 2004