Development of a Tor library



Given the increasing popularity of Tor following the Edward Snowden revelations, the dark net has
been the subject of much debate, both amongst academics and in the media. Existing Tor libraries
require Tor to be installed on the user's PC and are currently incapable of conducting an attack against
Tor, a capability which would be advantageous to law enforcement agencies on local, national and
international levels. This leaves room for further development within the field, and it was with these
reasons in mind that the current study was undertaken.

This project aims to create an intuitive application that is not reliant on an existing Tor client, and
to revolutionise the field by creating a Tor library that is capable of providing a foundation for further
development that could lead to the de-anonymization of Tor users. The outcome of this project was
a fully functioning independent library that, following extensive testing, was capable
of conducting an initial Denial of Service attack against a Tor node.

The results of this attack were inconclusive; however, the work still serves to demonstrate that the
developer has fulfilled his objective of creating a forward-looking library that provides a solid base for
future development in the field. It is hoped that the research and code developed throughout this project
will contribute to the development of an attack on Tor that can de-anonymise users of the network
and thus make a valuable contribution to law enforcement and the field of cyber security.


I wish to thank Dr. Gareth Owen for providing valuable guidance and support throughout the course of the project in his role as project supervisor.

1. Introduction

1.1. Introduction

This introduction discusses the motivation and reasons behind this project, as well as the project
objectives and constraints, before concluding with a brief summary of all the chapters in this report.

1.2. Rationale

Since the Edward Snowden revelations regarding government surveillance, more and more people
have wanted privacy and anonymity when using the internet. The Onion Router (hereafter
known as Tor) provides such a service, and is widely used by people wanting to remain anonymous on
the internet and in countries where internet censorship is a major issue.

As well as providing anonymity to internet users, Tor also allows a website or service to be hosted

within the Tor network; these are known as hidden services. A normal website is hosted on the

internet; the hosting location can be found and visitors to the site can be monitored, but a hidden

service provides the person hosting the service and the site’s users with anonymity. This can lead to

websites containing illegal material such as child pornography or drugs being hosted on Tor as both

the users and the host are guaranteed complete anonymity. This demonstrates that being anonymous

on the internet brings with it a lot of issues, one of which is accountability. For example, in the event

that somebody had used the internet for nefarious means, how can we prove they accessed a certain

website or illegal service?

1.3. The problem

Despite the increased exposure and popularity of Tor, development in the field is currently almost
non-existent. Although many research papers look at theoretical attacks on Tor (Borisov et al., 2013;
Biryukov et al., 2013b; Jansen, 2014), there is not currently a Tor application, library or framework that
would facilitate these theoretical attacks.

Currently, to use Tor, a user downloads and installs it to their machine. The application is very limited

in terms of its use; it is currently only designed to connect a user to the Tor network and allow them

to use a pre-configured web browser to use the internet anonymously.

Although Tor is an open-source project, meaning development of the Tor client is possible, the sheer size

and complexity of the application makes developing it extremely challenging. This means there is no

room to easily build upon or further develop the current application, for example to increase the

security of Tor or to program attacks to de-anonymise users.

It is clear that there is a need for an application that can connect a user to the Tor network, which

provides all the functionality of the current Tor application but which is also able to be expanded and

built up, with a final goal of implementing an attack such as a denial of service or de-anonymization


1.4. Objectives

The objective of the project is to create a Tor library which, unlike previous Tor libraries, does not

require the user to install Tor. This will make this project unique. It will also make the project more

challenging as the application will need to use the Tor protocol to communicate with the Tor network.

This involves challenges such as encrypting and decrypting packets to Tor nodes and communicating

with hidden services. This project aims to develop a fully functional library whose functionality is

comparable to that of the Tor client and, in the wider context, which will be able to be used for future

development for projects such as attacking or hacking Tor.

1.5. Constraints

As with all projects, many constraints were faced during the development of the project, both internally and externally:

Hard deadline – 12th September 2014. On this day all project deliverables must be handed in to the client.

Zero budget

Availability of software – Although the software used will be open source due to the budget constraint, support or access to the software may be limited.

Knowledge of the problem is limited; gaining the necessary understanding of Tor, as well as learning the new skills required for the project, will require a large amount of time.

Other commitments will interfere with the project; good project and time management are


1.6. Project Deliverables

The list of project deliverables for this project is:

A Tor library that can connect to Tor and to Tor’s hidden services, and which can be expanded on

A report to document the development of the application consisting of:

o Literature review

o Project Management

o Specification and discussions of the requirements

o Application Development

o Summary of the project

o Evaluation against requirements

o Conclusion of the project

o Bibliography

1.7. Report Structure overview

Each entry below gives the chapter number and title, followed by its synopsis.

1 Introduction

2 Literature review
Firstly, this chapter analyses what methods are available to ensure the most appropriate documents were chosen to review. Next is the literature review itself, which covers the topic areas of:
o What is Tor? (Historical and technical discussion)
o How does Tor provide its users with anonymity and privacy?
o Uses of Tor
o What is the social impact of Tor?
o Alternatives to Tor

3 Project management
Discusses why the method of project management chosen for this project was the most appropriate choice and the process that would be used to elicit requirements for the project. Next, a risk analysis and various countermeasures are discussed before concluding with a brief discussion of the intended schedule and any relevant professional issues that could arise.

4 Application Development
This chapter looks at each iteration stage of development. The requirements of each iteration are discussed, followed by a description of the design process and the implementation procedure. The testing of each iteration is explained, and the chapter concludes which features will be carried forward or dropped from the application in the following iteration.

5 Evaluation against requirements
Evaluates whether the project has met the requirements outlined earlier in the development. It will look into how the requirements have been met or exceeded, before looking at requirements that have not been met, providing an explanation of why these have not been met and a recommendation of how these may be achieved.

6 Conclusion of the project
Reflects on the project as a whole. Looks at how the project has been developed, what mistakes were made during the development as well as what has been achieved. This chapter discusses what the project has brought to the field of information security and possible future development for the project.

7 Bibliography
Lists all the references used throughout this report, as well as sources used throughout the development of the project.

8 Appendices
Contain all the figures, tables, and sections of code along with any other information, such as testing strategies, that accompany the report.


2. Literature Review

2.1. Introduction

Tor used to be unheard of, except within the tech community and amongst users of illegal sites.
However, since 2013, when the Edward Snowden leaks revealed that GCHQ and the NSA are unable
to de-anonymize all Tor users, Tor has been cast into the public eye. Now with 200,000 daily connected
users (Tor Project, 2014), Tor and its use are debated more than ever.

This report examines Tor in great detail; what it is, who uses it and for what reasons. It will then discuss

the illegal side of Tor, before looking into why Tor needs to be investigated and what has been done

so far.

Further research and a review of the relevant literature are carried out throughout the project, as

research into the Tor protocol as well as research into current attacks, not necessarily against Tor, will

be required.

2.2. Analysis

What is Tor?

Tor is an open source project that is a decentralized low latency mix network of specially configured

nodes, commonly called relays or bridges, which transmits only TCP traffic through virtual tunnels

from a client to a destination, usually, but not exclusively, the Internet.

Tor has been incorrectly described as having a single authority and being a Virtual Private Network

(VPN) or a peer-to-peer (P2P) network (Hurley et al., 2013, p. 1); but this is not the case.

Tor is described by McCoy et al. as a “privacy enhancing system, designed to protect the privacy of

Internet users from traffic analysis” (McCoy et al., 2008, p. 63). Syverson elaborates on this, stating

that Tor provides anonymous connections to the Internet providing protection against traffic analysis

as well as eavesdropping (Syverson, Goldschlag & Reed, 1997, p. 44). Both groups of scholars agree

that Tor provides anonymity.

As well as providing users with anonymity and privacy, Tor can be a tool for anti-censorship. Endorsed

by the Electronic Frontier Foundation and other civil liberties groups, one use of Tor is as a means of

communication between journalists, whistle-blowers and human rights workers (Levine, 2014).

Tor is also the network supporting what the media commonly refer to as the 'Darknet', due to the

encrypted nature of the network and its association with illegal, or ‘dark’, activities.

History of Tor

Roger Dingledine is largely credited with being one of Tor’s creators; in 2004 he was part of the team
that released the paper Tor: The Second-Generation Onion Router (Dingledine, Mathewson &
Syverson, 2004). Levine explains how the concept behind Tor, “Onion routing”, can be traced back to
1995, and this should be considered the origin of Tor (Levine, 2014).

Michael Reed, one of Tor’s creators, revealed Tor was originally created for military intelligence usage,

such as open source intelligence gathering, and the reason for releasing it open source was to provide

better cover traffic to hide what the network was really being used for (Reed, 2011). Reed expands

upon this, adding that Tor was not designed for helping dissidents in repressive countries or criminals

to avoid law enforcement (Reed, 2011).

Until 2013, Tor was almost unheard of except amongst technology and criminal circles. In 2013,
Edward Snowden made a series of high profile disclosures about several global surveillance programs.
One leaked document, “Tor Stinks”, made international news; it describes how the NSA tried to
compromise Tor anonymity (The Guardian, 2013).

Silk Road, a Tor hidden service which is also known as the eBay for drugs, was an online marketplace

where users could buy and sell drugs around the world (Barratt, 2012, p. 683) and was hailed as being

a “criminal innovation” (Aldridge & Décary-Hétu, 2014). It was taken offline in 2013 by the FBI. This

was the first public demonstration of a government agency taking down a website hosted on Tor, and

attracted significant media attention internationally (Greenberg, A., 2013b).

However, the effectiveness of the closure of the Silk Road has been questioned by Greenberg, who

reports that Silk Road 2.0, an updated and improved Silk Road, was online just one month after the

closure of Silk Road (Greenberg, A., 2013a).

Since these leaks and the FBI takedown of the Silk Road, Tor has never been more popular (Jeffries,

2013); its popularity has sky-rocketed and it now has around 2.5 million monthly users (Tor Project,


How Tor works

Tor is built on second-generation onion routing. Syverson et al. (2000, p. 1) state that one of the major
reasons for the change of design from the first to the second generation was to be able to release the
source code for public distribution, as the patent on first-generation onion routing prevented this.
Dingledine et al. (2004, p. 1) describe how another of the major reasons for changing the design of onion
routing was to solve “many critical design and deployment issues that were never resolved” as well as
stating the “design has not been updated in years”.

Tor creates a “circuit” through several, usually three, Tor nodes: a Guard (Entry) node, a relay, and an
exit node. Borisov et al. (2007) strengthen the case for selecting three nodes by
demonstrating how increased circuit lengths compromise anonymity.

With the nodes selected, and the user wishing to send a message through the Tor network, the
message is encrypted in layers, like an onion, with the keys of all three nodes.

The Diffie-Hellman (DH) handshake is used between each relay and the client to create the session
key (Jagerman et al., 2014, p. 4). However, the DH handshake has also been criticized, with some
scholars proposing an ElGamal key agreement based protocol, although this has not yet been
implemented (Øverlier & Syverson, 2007, p. 4). Catalano is also critical of the current onion routing
protocol, claiming it has a high round complexity which affects the running time, although he agrees
that Tor onion routing provides forward secrecy and is secure (Catalano et al., 2011, p. 255). Loesing
argues that, because of the DH key exchange, “building circuits in Tor is a time consuming task”,
although he fails to fully justify his reasons for this statement (Loesing, 2009, p. 48).

Dingledine et al. (2004), Øverlier and Syverson (2007), and Loesing (2009) all acknowledge that using
DH to create the circuits in Tor achieves perfect forward secrecy, with Loesing explaining that, if an
attacker were to collect and store all traffic, later trying to force the nodes to decrypt it would be
ineffective due to the “telescoping” approach (2009, p. 45). This means that once the session keys are
deleted, a relay cannot be forced to decrypt old traffic. In a similar vein, Øverlier & Syverson (2007, p.
3) demonstrate how, due to the forward secrecy feature of Tor, such attacks do not succeed.
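
The key agreement and the forward secrecy property described above can be illustrated with a minimal Python sketch. This is not Tor's actual handshake: the group parameters P and G are toy values chosen for illustration, the function names are hypothetical, and hashing the shared secret with SHA-256 is an assumed stand-in for a real key derivation step.

```python
import hashlib
import secrets

# Toy group parameters (assumption): a known Mersenne prime and a small base.
# They are far too weak for real use and are not the values Tor uses.
P = 2**127 - 1
G = 3


def make_keypair():
    """Return an ephemeral (private, public) Diffie-Hellman pair."""
    private = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def derive_session_key(own_private, peer_public):
    """Hash the shared secret g^(xy) mod p into a 32-byte session key."""
    shared = pow(peer_public, own_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


# The client and one relay each generate an ephemeral pair and swap public values.
client_priv, client_pub = make_keypair()
relay_priv, relay_pub = make_keypair()

client_key = derive_session_key(client_priv, relay_pub)
relay_key = derive_session_key(relay_priv, client_pub)
assert client_key == relay_key  # both ends now hold the same session key

# Forward secrecy: once the ephemeral private values (and later the session
# key) are discarded, neither party can be forced to decrypt recorded traffic.
del client_priv, relay_priv
```

In a three-hop circuit this exchange would be repeated once per relay, with each later exchange carried through the hops already established, which is the “telescoping” approach referred to above.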

The message is encrypted with the last node’s key first, then the encrypted message is encrypted again

with the second node’s key, before finally encrypting this message with the first node’s key.

Onion routing is shown in the image below:

Figure 1 - Onion routing circuit

Dingledine et al. (2004, p. 5) provide an explanation of how a message that is to be sent through an

onion routing network is then decrypted, with the message arriving at and being decrypted by the first

node; here the only information that will be known is which node to send the packet to. This will be

done at each node through the circuit, until the message is sent out to the Internet. The removal of a

layer of encryption is like peeling the layers of an onion, hence the name onion routing.
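
The layering and peeling described above can be sketched in a few lines of Python. This is an illustrative toy only: the keystream built from SHA-256 is an assumed stand-in for a real stream cipher, the session keys are hypothetical constants, and the helper names are not taken from any Tor implementation.

```python
import hashlib


def keystream(key, length):
    """Expand a key into `length` bytes using SHA-256 in counter mode (toy cipher)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_layer(key, data):
    """Add or remove one layer; XOR with the same keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(message, hop_keys):
    """Client side: apply the exit node's key first and the guard's key last."""
    for key in reversed(hop_keys):
        message = xor_layer(key, message)
    return message


# Hypothetical session keys, listed in path order: guard, middle relay, exit.
hop_keys = [b"guard-session-key", b"relay-session-key", b"exit-session-key"]
cell = wrap(b"GET / HTTP/1.1", hop_keys)

# Each node removes exactly one layer as the cell travels along the circuit.
for key in hop_keys:
    cell = xor_layer(key, cell)

print(cell)  # b'GET / HTTP/1.1'
```

Because the toy layers are plain XOR operations they commute, so this sketch does not enforce the order in which nodes peel their layers; it only shows that the plaintext appears once every hop has removed its own layer.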

Apart from the three main nodes, Entry, relay, and exit, there is another type of relay, a bridge; this is

a relay that is not listed on the consensus, and is used as the first hop in the circuit to route traffic out

of a certain country.

Ling et al. (2012, p. 2381) conclude that Tor bridges are critical to counter censorship blocking. Winter
and Lindskog reinforce this viewpoint by explaining how China is blocking all public relays in the
consensus, stating that only “1.6% of public relays are able to be connected to” (2012, p. 11).
Since bridges are not listed in the consensus, they are harder to block in this way.

Hidden services

Services such as websites can also be hosted within the Tor network. These are known as “Hidden

Services” and can only be visited through Tor. Tor Hidden Services were added in 2004, when the

second generation of onion routing was developed (Dingledine et al., 2004).

Loesing (2009, p. 36) describes how Tor allows a TCP-based service to be accessed while concealing

the IP address of the hosting server. This enables a user (Alice) to connect to another user’s

(Bob’s) server without knowing where, or who, it is. Loesing (2009, p. 40) also brilliantly sums up the

hidden service design as, “connecting two circuits created by client and server on a common

rendezvous point”.

Biryukov et al. (2013, p. 82) expand on Loesing’s explanation, stating that all communication between

a client and the hidden service is done through a rendezvous point (RP) which connects circuits from

the user to the hidden service. These RPs are mutually agreed points (Dingledine et al., 2004).

[Figure 1 labels: Data; Guard node; Relay node; Exit node; Tor protocol; Unencrypted unless using HTTPS]

Dingledine et al. (2004) also discuss how to connect to a hidden service. The user first uses a hidden
service descriptor to search the distributed hash table, a lookup service that informs the user of which
introduction points (IP) are servicing the hidden service. The user then tells the hidden service, via one
of these IPs, which RP will be used, and that RP is then used to communicate with the hidden service.
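
The connection sequence above can be modelled as a small in-memory simulation. The sketch below is a simplification under stated assumptions: the directory is a plain dictionary rather than a distributed hash table, no circuits or encryption are involved, and the class and function names (HiddenService, RendezvousPoint, connect) are hypothetical.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class RendezvousPoint:
    name: str
    waiting: dict = field(default_factory=dict)

    def join(self, cookie, side):
        """Record that one side has arrived with this rendezvous cookie."""
        self.waiting.setdefault(cookie, set()).add(side)

    def is_connected(self, cookie):
        """True once both the client and the service hold the same cookie here."""
        return self.waiting.get(cookie) == {"client", "service"}


@dataclass
class HiddenService:
    onion_address: str
    introduction_points: list

    def descriptor(self):
        """What the service publishes so that clients can find its introduction points."""
        return {"address": self.onion_address,
                "introduction_points": list(self.introduction_points)}

    def handle_introduction(self, rendezvous_point, cookie):
        """Service side: connect out to the rendezvous point the client chose."""
        rendezvous_point.join(cookie, side="service")


def connect(directory, address, rendezvous_point, services_by_intro):
    descriptor = directory[address]                # 1. look up the descriptor
    cookie = secrets.token_hex(10)                 # 2. one-time rendezvous cookie
    rendezvous_point.join(cookie, side="client")   # 3. client waits at the RP
    intro = descriptor["introduction_points"][0]   # 4. pick an introduction point
    # 5. the introduction point relays the RP and cookie to the service
    services_by_intro[intro].handle_introduction(rendezvous_point, cookie)
    return cookie


service = HiddenService("example.onion", ["intro-A", "intro-B"])
directory = {service.onion_address: service.descriptor()}
rp = RendezvousPoint("rp-1")
cookie = connect(directory, "example.onion", rp,
                 {"intro-A": service, "intro-B": service})
print(rp.is_connected(cookie))  # True
```

The point of the model is that the client never learns where the service is hosted: it only ever talks to the directory, an introduction point and the rendezvous point.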

Who uses Tor and for what?

Tor was originally designed for military intelligence gathering (Reed, 2011); however, its user base has
since become more diverse. Whilst it is still used by the military and law enforcement, its users now
include activists reporting abuse from danger zones (Tor Project, 2014b).

This is particularly relevant given the continued fighting in areas such as Iran and Syria; Tor has been
instrumental in getting reports of information from within these countries to the outside world. Tor was
even awarded the FSF's Award for Projects of Social Benefit for its role in the revolutions in the Middle
East (Sullivan, 2011).

Since the Snowden revelations about the surveillance program PRISM, a data mining program used by

the NSA and GCHQ to store Internet communications from companies such as Google and Yahoo, Tor

has been increasingly used by normal people seeking privacy online and wanting to prevent

government agencies from monitoring them. Munson validated this assumption by showing how Tor

usage has increased dramatically since the Snowden leaks, which he feels suggests that Tor is being

used by users seeking privacy online (Munson, 2013). Arma (2013) however contradicts Munson’s

conclusion stating that, due to the increase of “ESTABLISH_RENDEZVOUS requests", this growth is

instead likely to be from a botnet, although he also acknowledges some growth can be attributed to

activists in Syria and the United States.

The distribution of copyrighted digital material has moved from peer-to-peer to the “Darknet” (Wood,

2010, p. 1); this contradicts the Tor project message that users are only using Tor for good, and legal,

means. Wood believes users of the “Darknet” exploit the anonymity provided by Tor to illegally

download material without being able to be traced. Biryukov et al. (2013b, p. 2) strengthen the

argument that Tor is used for illegal purposes; they show that 44% of hidden services were hosting

illegal content such as drugs, pornography, illegal copyright material etc.

Arguably the most famous hidden service is Silk Road, also known as the eBay for drugs (Barratt, 2012,

p. 683). Reportedly, the Silk Road had between 30,000 and 150,000 active customers (Christin, 2012,

p. 2). The Silk Road was used to make significant numbers of transactions; Konrad (2013) reports that

the Silk Road’s owner and operator Ross William Ulbricht handled $1.2 billion of transactions in the

2.5 years before FBI seizure. This clearly shows how popular illegal activities are on Tor, greatly

contradicting the Tor project’s stance that Tor is used exclusively for good and instead showing that

Tor is being used to conduct illegal activity anonymously, thus making traceability and accountability

extremely difficult.

Figure from Dingledine et al., 2004, p. 3

Figure 2 - Hidden service architecture

Cox (2014) cites Bartlett’s radical views about Tor. Bartlett sets out to discover “the range of things

that people do under the conditions of anonymity” (Cox, 2014) and implies that, under the cover of

anonymity, people will do more distressing and ‘dark’ things, for example talking to “trolls on pro-

anorexia forums” (Cox, 2014). He goes on to suggest that human nature finds a haven on the Internet

and uses the tools available to enable this; in this example Tor allows these illegal and immoral

activities to be conducted anonymously.

While this view regarding the reason why Tor is being used for illegal purposes may be a radical one,

it does not hide the fact that a large percentage of Tor is used for illegal activities by criminals using

the blanket of Tor’s anonymity as protection from prosecution.

Current Tor development projects

With Tor being open source, developers are free to use the source code to modify and develop their

own applications using Tor. As well as being able to modify the source code, there are libraries that

exist which allow developers to use Tor for their projects.

Stem, a Python controller library for Tor, requires Tor to be installed on the machine and controls Tor
through the control port (usually port 9051). Winter demonstrates how this can be used to create
circuits as well as streams within these circuits, although he implies that the lack of features and
functionality in Stem inspired him to create his own application rather than using the existing Stem
library (2014, p. 3). However, Atagar praises Stem for its friendly API and documentation, while
simultaneously criticizing the lack of backward compatibility it offers (Atagar, 2012).
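
To make the dependency on an installed Tor client concrete, the following sketch shows roughly what a controller library does underneath: it opens a socket to the control port and exchanges text commands. It uses only the Python standard library rather than Stem itself. The function names are hypothetical, and the sketch assumes a local Tor client with its control port enabled on 9051 and either no authentication or password authentication.

```python
import socket


def parse_getinfo_reply(lines):
    """Collect key=value pairs from the reply lines of a successful GETINFO."""
    info = {}
    for line in lines:
        status, rest = line[:3], line[4:]
        if status != "250":
            raise RuntimeError(f"controller error: {line}")
        if "=" in rest:
            key, value = rest.split("=", 1)
            info[key] = value
    return info


def query_tor_version(host="127.0.0.1", port=9051, password=""):
    """Authenticate to a locally running Tor client and ask for its version."""
    with socket.create_connection((host, port), timeout=5) as sock:
        reader = sock.makefile("r")
        sock.sendall(f'AUTHENTICATE "{password}"\r\n'.encode())
        if not reader.readline().startswith("250"):
            raise RuntimeError("authentication failed")
        sock.sendall(b"GETINFO version\r\n")
        lines = []
        while True:
            line = reader.readline().rstrip("\r\n")
            lines.append(line)
            # The last line of a reply has a space after the status code.
            if len(line) < 4 or line[3] == " ":
                break
        return parse_getinfo_reply(lines)["version"]
```

Calling query_tor_version() only succeeds when a Tor client is already running on the machine, which is exactly the limitation of controller libraries that this project sets out to remove.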

Txtorcon, previously TorCtl, is another Python controller library. It works in exactly the same way, using

the control port to control Tor. The TorCtl library was used for the development of the Torbutton Firefox

extension. Meejah (2014) describes how Txtorcon communicates with the Tor network as being “an

asynchronous API to speak the Tor client protocol in Python” and believes the main goal of Txtorcon

is to enable applications to use the Tor network to improve people's privacy and anonymity on the


Atagar draws a comparison between both libraries, noting that both have “extensive test suites and

are being very actively maintained” (Atagar, 2012) and that both just control the Tor client.

Current attacks on Tor

With such a diverse client base, it is easy to see who may want to attack Tor, and their reasons for

doing so. For example, law enforcement will want to catch paedophiles who access hidden services

containing child abuse, whilst regimes such as China or Iran want to prevent users accessing Tor and

so may try to take down Tor altogether.

Denial-of-Service (DoS)

Wood and Stankovic (2002, p. 55) describe a DoS attack as “any event that diminishes or eliminates a
network’s capacity to perform its expected function”. Borisov et al. agree with this statement and
also claim that, as well as the blanket DoS which affects the whole network, a selective DoS attack can
target just one small section of the network (Borisov et al., 2007).

Whereas Wood and Stankovic (2002, p. 55) define a DoS attack broadly, Wang et al. (2004, p. 193)
describe it more specifically as the flooding of a node with traffic such as requests until the node is
unable to function.

Jansen (2014) details a selective DoS attack that targets either the entry or exit node. The attack
works by requesting data from a source such as a file server, after which the client stops reading from
the TCP connection, thus exploiting Tor's flow control, before requesting more data from the source.
This causes the memory use of the target node to increase until the OS terminates the Tor application
on the relay. Jansen (2014) acknowledges the attack's effectiveness but is also critical of it; he
discusses several simple defences that would prevent this attack, such as the implementation of
authenticated SendMe cells to prevent the flow control being misused.
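
The memory growth behind this attack can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope model rather than an implementation of the attack: it contains no network code, the function name is hypothetical, and the constants (a 512-byte cell, an initial window of 1000 cells and an increment of 100 cells per SendMe) are assumptions based on commonly cited Tor flow control values.

```python
CELL_SIZE = 512         # bytes per cell (assumed)
INITIAL_WINDOW = 1000   # cells a sender may have in flight on a circuit (assumed)
SENDME_INCREMENT = 100  # extra cells granted by each SendMe (assumed)


def queued_bytes(sendmes_sent, cells_read):
    """Bytes buffered at the target when data is acknowledged faster than it is read."""
    cells_sent = INITIAL_WINDOW + SENDME_INCREMENT * sendmes_sent
    return max(cells_sent - cells_read, 0) * CELL_SIZE


# An honest client only sends a SendMe after reading the cells it acknowledges.
honest = queued_bytes(sendmes_sent=50, cells_read=5000)

# A stalled client keeps sending SendMe cells but never reads any data.
stalled = queued_bytes(sendmes_sent=50, cells_read=0)

print(honest, stalled)  # 512000 3072000
```

In this model the honest client's backlog never exceeds one window, whereas the stalled client's backlog grows with every SendMe it sends, which is the behaviour that authenticated SendMe cells are intended to prevent.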

De-anonymization of hidden services

Rob Jansen expands on his aforementioned DoS attack in order for it to be able to conduct the de-

anonymization of hidden services. Jansen expands on the attack first developed by Biryukov,

Pustogarov and Weinmann (2013). This attack requires the current guard of a selected hidden service

to be known and taken offline, forcing the HS to select a new guard node; this is repeated until the

attacker’s guard node is picked. However, this requires the attacker to run a compromised guard relay

and Jansen is critical of this attack’s success. He suggests that if middle guards were used, then the

attack would not work. Jansen also criticises the attack’s method of taking a node offline, stating if the

node was correctly configured to prevent a sniper attack, then this would fail. He goes on to further

suggest that if the node were to simply reboot after it was taken offline, this alone would be enough

to render the attack ineffective (Jansen, 2014).

ASN (2013) frame this attack in a new light: instead of trying to prevent the attack, they suggest the

core design of Hidden Services is flawed and in need of a redesign if it is to continue to be secure and


Conclusion

The review has enabled the developer to gain a solid understanding of what Tor is, how it works and

who it is used by. In addition, it has provided the opportunity for existing Tor libraries to be reviewed. By

comparing Stem and Txtorcon it became clear that both offer the same level of functionality, but in

some areas this is at such a low level that they can be considered to be severely lacking, as they only

allow an application to control the Tor client and not control Tor directly. From this section it is clear

to see this is an area of research that is significantly lacking, and would benefit greatly from more

research and development.

The current attacks on Tor section is also extremely relevant to this project; this section analysed two

different attacks, a DoS attack and an attack which de-anonymised hidden services. This section in

particular featured analysis from several authors who were not in agreement over the effectiveness

of the attacks demonstrated, but all agree on one thing: Tor is vulnerable to attacks.

Overall this literature review has demonstrated that while Tor is a network that has been around for

over 10 years, and has been heavily researched, there are still many opposing viewpoints regarding

certain topics such as attacks. It has shown why Tor is more popular today than it ever has been, and

the impact that the anonymity of Tor has on the usage of Tor.

Tor development libraries were shown to be an area in which little research has been conducted, and

where the current solutions have been met heavily with criticism and opposing views.

3. Project Management

3.1. Brief description of chapter

This chapter discusses the project and requirement elicitation methodologies that were used for this

project. It also considers the project’s potential risks and the countermeasures that were implemented

to negate these, as well as discussing the original schedule designed for the project.

3.2. Select methodology

Project management has been used in various forms for centuries, but only since the 1950s has the

modern concept of project management existed (Kwak, 2003, p. 1). The origin of modern project

management is disputed, although there are records of it being used in the 1950s by US defence,

Xerox, Bell Laboratories and even NASA.

Modern project management can be defined as the “application of knowledge, skills and techniques

to execute projects effectively and efficiently” (PMI, 2014).

Since the 1950s, project management has become a necessity on all projects; it reduces the risk of the

project failing, or the wrong features being developed and ensures that the project is completed

efficiently and on time.

The Agile Method of project management was chosen for this project. The Agile Method favours

“working software over comprehensive documentation” (Paulk, 2002, p. 15). In this project the client

only expects a working application and this report to be produced; no other documentation is

required. This makes the Agile Method a good choice for this project, as it permits development to be

started immediately and provides the client with what they want, whereas methods such as the

Waterfall method focus a lot more on documentation, thus delaying the start of development.

Another reason for choosing this method was that it promotes adaptability throughout the project’s

lifecycle. Unlike other methods, such as the Waterfall Model, the Agile Method allows and encourages

changes to be made as and when they are required. This ensures development is continuous and will

not be stopped by a requirement that cannot be implemented. This method also allows for regular

testing and for working software to be shown frequently to the client; this promotes client feedback,

and any necessary changes can be immediately implemented with little cost to the design and

development of the project. This would not be possible with other traditional methodologies, in which

changes can be costly and can usually only be implemented at the end of the development. This

feature also mitigates risks and ensures that a quality application is produced; any bugs or issues are

identified within an iteration (a short timescale of around three

weeks) and can be dealt with at that point, rather than having

long-lasting effects on the application as would be the case with

the Waterfall model.

A further benefit of showing the client frequent iterations of the

project is that they get to see the progress that is being made,

which on a complex project allows them to gain a sense of the

challenges faced by the developer and to feel that they are

having an input in the application’s development. If Waterfall or

Spiral models are used, the client only receives the end product

and thus does not gain the same understanding of the project

and the issues that the developer faced. The Agile Method can

therefore be argued to lead to better customer relations.

Figure from: Stack Exchange, 2013.

Figure 3 - Agile methodology diagram

The Agile Method also promotes continual improvement of the application by taking positive, and

negative, aspects of the current iteration forward to be expanded upon in the next iteration. In other

methods, this can only be achieved at the end of development, ready for the next major release. This

ensures that positive features are capitalised on and promoted, optimising the quality of the


A further benefit of this method over other traditional methods, such as the Waterfall or Spiral models,

is that if the development is running behind schedule and the deadline is likely to be missed, it is

possible to liaise with the client and choose to focus on core requirements. Dropping any non-critical

requirements from the development plan ensures that a working application, albeit one with reduced

functions, can be handed to the client instead of having an application that is only half-developed.

This means that using this method increases the chances of the developer meeting the deadline by

keeping the project moving consistently forward, rather than letting it grind to a halt over goals

and requirements that cannot be implemented.

3.3. Requirements elicitation

Well thought out, well-structured requirements generally lead to a more successful project, one that meets

client expectations and delivers an application that is fit for purpose (Hickey & Davies, 2002). It is

therefore important to ensure that the requirements gathered provide sufficient detail, are realistic

and achievable within the timeframe of the project.

“Requirements elicitation is the process of seeking, uncovering, acquiring and elaborating

requirements for computer based systems” (Zowghi & Coulin, 2005, p. 1).

This process may be time consuming, but it helps to prevent a project going over-budget, being

delivered late or failing to provide the required functionality (Jones, 1995, p. 86).

Using the Agile Method means that requirements will be gathered at the start of each iteration. This allows

the requirements to take into account any issues faced or lessons learned during the previous iteration

and differs from the Waterfall Method, where the requirements for the entire project are thought of

prior to starting development. Using the Agile Method means that the requirements elicitation

method will need to be run several times throughout the course of the project. This makes choosing

an elicitation method critical; a method that takes a long time to get results would be an inappropriate

choice as it would delay the development of the project. For this reason a questionnaire would be

considered an inappropriate method of requirement elicitation for this project, as the time spent

waiting for replies to the questionnaire could eat into valuable development time.

There is more to requirement elicitation than the client simply telling the developer what they want;

a more detailed research process is required. This involves finding exactly what the client realistically

expects from the application as well as looking at similar projects that have been developed to see

what features these offer that could be integrated into the development of the application. A

combination of this information can be condensed into well-structured and achievable requirements

that can be implemented into the project.

The elicitation methods that were considered for this project are shown in the table below, along with

a summary of their appropriateness for the project:

| Method | Outcome | Appropriate? |
| --- | --- | --- |
| Interview the client (may be formal or informal, in person or via online correspondence) | Greater understanding of what is expected, issues with current solution etc., however the quality of answers is dependent on questions asked. | Yes – This is a core method that will be used, this will allow us to understand exactly what the client is expecting, the reasoning behind the project and the time scale in which it is required. Keeping in constant communication with the client will promote customer satisfaction. |
| Interview the end users (may be formal or informal, in person or via online correspondence) | Understand what the users actually want, if the client and end users expectations are the same, again the quality of answers is dependent on the questions asked. | Yes – another core method that will be used to see if the user wants the same features etc. as the client, also allows information to be gathered that could have been missed from the client interview. |
| Prototyping | Rapid prototyping would develop a small section of functionality; this could be used to get feedback on the section from users or the client, and could be used to estimate a timescale for the project. | No – High cost of failed prototypes, not required due to the chosen agile methodology, however a prototype would allow evaluation of the proposed approach to development. |
| Case study | A report which will allow the understanding of the current system/application. An example of a case study model is the critical incident technique which observes the human interaction of the current system. (Woolsey, 1986) | No – This project does not build upon an existing application, and as case studies are almost always retrospective, it is not appropriate for this project. |
| Brainstorming | Could be conducted alongside the interview process; a way for all ideas that may not necessarily have been discussed during the interview to be presented. | Yes – Will allow for the members involved to discuss ideas that may not be core to the project, but ideas that are revolutionary, or never before done, but would be welcomed if possible. |

Figure 4 - Table showing possible elicitation methods

As the above table shows, three different methods were selected. These are interviews both with the

client and with the potential end-users of the application, as well as brainstorming sessions which will

be run in conjunction with the interviews. Using several different methods, and focusing on the end-

user as well as the client, ensures that the functions of the application meet the client’s expectations

and results in a usable application that meets the requirements of the end-user. Brainstorming with

the client allows the developer to pick up on any implicit requirements that have not been explicitly

stated by the client but that would still greatly benefit the application and ensure the client is satisfied

with the final product.

3.4. Risk analysis

As with any project there is risk involved, and although using the Agile method mitigates risk, there

are still some risks to the project. The table below shows the possible risks, their potential impact level

and finally what reduction strategy will be in place to ensure they are mitigated as much as possible

during the project development.

| Risk | Impact level | Reduction strategy |
| --- | --- | --- |
| Tor network unavailable – Internet access or the Tor network unavailable | High | Use a Tor simulator such as the Tor Path Simulator (TorPS). |
| Poor productivity – Developer’s motivation inhibits the project’s development | High | Set 20 hours a week minimum for the project, more when needed. Setting small milestones will increase motivation and productivity. Regular meetings with the client will ensure the milestones are met. |
| Technical risk – Project is too complex to implement | High | Regular meetings with the client ensuring they are kept up to date with the development, and adjust the requirements to allow for a work around if possible. |
| Programmatic risk – Customer changes their mind about wanting the project developed | High | Find another client or adapt the project to cater for the client’s change of heart. |
| Inherent schedule flaws – due to the uniqueness of the project, it is difficult to estimate and schedule. | Medium | Better to overestimate than underestimate timescales; use the Agile methodology to renegotiate the schedule with client. |
| Requirements Inflation – more features that were not identified at the beginning of the project emerge that threaten estimates and timelines. | Medium | Keep in constant contact with the client with regular meetings etc., only accept more features if timescale allows. |
| Specification Breakdown – Only during the development does a conflicting requirement become apparent. | Medium | Contact the client, work out a solution that would have the lowest impact. |
| Insufficient resources – Unable to develop the project due to not having access to a required resource. | | See if the resource is really required, look for ways to reduce resource use previously in the project, and try to gain the required resource. |
| Incorrect budget estimation – Overall cost of the project starts to increase and spiral | | There is a budget of zero for this project, to maintain this open source software and libraries will be used. |

Figure 5 - Table containing potential risks and countermeasures

3.5. Schedule

The project has a hard deadline of September 12th 2014, at which point the application, all

documentation and the accompanying report must be completed. A Gantt chart showing the planned

schedule can be found in Appendix 3. As the chart shows, some additional time has been allowed to

factor in potential delays during the project. However, due to the nature of the project and

methodology used, it is possible that some iterations will take less time than others, and that

iterations will be added or removed. The schedule is therefore very adaptable and serves only as a baseline, as it is likely to

change once development begins.

3.6. Professional issues

Appendix 2 contains the ethical checklist that accompanies this project. This document revealed that

there were no ethical concerns raised by this project. To ensure copyrighted code is not used, only

open source libraries and code will be used and, should code be required from other sources, it will

only be used after getting the express written permission of the author/owner. Should any

questionnaires or user feedback be required during the testing phases of the project then this will be

conducted anonymously, and all respondents will be under no pressure from the developer to take part.


User information will never be put at risk and the creation of this application will at no point

compromise users’ data or identity. Any attacks that are developed as part of this project will be

conducted on a closed network where the developer has complete control of the Tor node. This

ensures that users of the Tor network are not, at any point, affected by the development of this application.


Should a situation arise during the development of the application in which a potential professional

issue arises, this will be dealt with before it occurs to ensure that the project never breaks any ethical

codes or laws.

3.7. Conclusion of the section

This section has justified the use of the Agile Method as a project methodology, having fully considered

its advantages and disadvantages over more traditional models, like the Waterfall model.

Furthermore, the importance of requirements was considered and the methods used to elicit the

requirements for this project were discussed. Potential risks of the project were analysed and

countermeasures were implemented to mitigate the possible effects of these risks. Finally an

estimated schedule for the development of the application was drawn up, which factors in some

unforeseen delays during the development stages. With these aspects of project management in

place, a smoother development should be possible and the application should meet the requirements

and be delivered to the client by the deadline.

4. Application Development

4.1. Iteration 1

4.1.1. Requirements

All requirements for this project will be split into two categories: functional requirements and

non-functional requirements. Functional requirements describe what the software should do, whilst

non-functional requirements describe qualities against which the operation of the software is judged. By their nature, non-functional

requirements can be difficult to evaluate because they tend to be based on the subjective opinion of

the assessor rather than being fact-based.

During the requirements elicitation for this iteration, all of the non-functional requirements were

elicited. Unless otherwise stated, these will be presumed to apply to each iteration of the project,

although they will only be discussed in this section of the report. In the case of this project, the non-

functional requirements can be considered as principles based on the ISO 9126-1 software quality

model (ISO, 2001) which the project should aim to meet and can therefore not be attributed to one

specific iteration.

The functional requirements for this iteration were elicited using the aforementioned methods and

are shown below:

| Requirement | Importance Level |
| --- | --- |
| Connect to the Tor network | High |
| Send and receive a version cell | High |
| Decode NetInfo cell to extract data from it | High |
| Handle errors from destroy cells | Medium |

Figure 6 - Table containing requirements for iteration 1

The non-functional requirements for this project can be seen in the table below:

| Quality Characteristics | Requirement | Importance Level |
| --- | --- | --- |
| Portability | Able to run the application without installing Tor | |
| Portability | Able to run on multiple platforms (Windows, Mac, Linux) | |
| Portability | Not require the application to be installed to run | |
| Reliability | No more than 10 bugs on delivery | High |
| Efficiency | Use as little computational resources as possible such as RAM. (No more than 1 GB of RAM) | |
| Usability | No GUI | High |
| Usability | Precise and constructive error messages | High |
| Usability | Documentation | High |
| Usability | Universal naming standard | High |
| Dependability | Able to operate normally or abnormally without threat to life or environment | |
| Legal | Only use open source software | High |
| Maintainability | Able to expand the system to incorporate new features, fix defects or deal with new technology. | |
| Adaptability | Able to change the system to handle additional domain concepts | |


Figure 7 - Non-functional requirements

The importance of considering both functional and non-functional requirements when developing the

application can be seen from the first functional requirement: to be able to connect to the Tor

network. Clearly, this is a critical requirement; failure to connect to the network will prevent the

project from being continued. One simple way to connect to the Tor network would be to install Tor

and allow the application to use the Tor client. However, this would conflict with the first non-functional

requirement: to not need to install a Tor client in order to use the application. Failure to consider both

functional and non-functional requirements during the development of the application could result in

some of the requirements being contradictory and thus not all of the requirements would be able to

be met.

The second functional requirement, to be able to send a version cell, requires a packet to be sent to a

Tor node informing it of the protocol version the client wishes to use for communication. This packet must fulfil the

criteria outlined in the Tor protocol specification document (Dingledine & Mathewson, n.d.). This

should be a simple requirement to achieve. This requirement is critical to the development of the

project as it sets up the communication between the client and a Tor node.
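
As an illustration of this exchange, the sketch below packs the payload of a version cell and selects the version to use once the node's own version cell has been received. In the Tor protocol each version is a two-byte big-endian integer, and both sides settle on the highest version that appears in both cells. The function names and the list of supported versions are assumptions made for this example; they are not taken from the project's code.

```python
import struct

# Link protocol versions this hypothetical client is willing to speak.
SUPPORTED_VERSIONS = [3, 4]


def pack_versions_payload(versions):
    """Pack each version number as a two-byte big-endian integer."""
    return b"".join(struct.pack(">H", version) for version in versions)


def unpack_versions_payload(payload):
    """Return the list of version numbers held in a version cell payload."""
    count = len(payload) // 2
    return list(struct.unpack(">%dH" % count, payload[:count * 2]))


def negotiate_version(our_versions, their_versions):
    """Return the highest version both sides support, or None if none match."""
    common = set(our_versions) & set(their_versions)
    return max(common) if common else None
```

If no common version exists, the sketch returns None so that the caller can close the connection rather than continue with an unsupported protocol.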

The third requirement, to decode a NetInfo cell, will likely prove to be challenging. The data contained

within the NetInfo cell must be extracted accurately and in the correct order.

The first three functional requirements were all considered to be critical; these requirements provide

the base upon which the application can be developed. Failure to meet these requirements at this

stage of the project could jeopardise the entire project as they provide key functionality to the

application. The fourth functional requirement - to handle data from a destroy cell - is also important,

but is not a critical requirement as, although it is desirable, it will not affect the application’s

functionality. Therefore, in this iteration, the first three functional requirements should be prioritised.

As already established, the non-functional requirements (NFRs) will affect the entire project and their

importance should not be under-estimated. The portability NFRs may seem simple to achieve, but

fulfilling these requirements will have major impacts on the project, and will, for example, have an

effect on the programming language chosen as it must be cross-platform compatible and be capable

of being used to achieve the functional requirements.

Usability, maintainability and adaptability NFRs should be simple to implement and can be said to be

of critical importance to the project. This project intends to create a library suitable for further

development; an application with poor usability would not be chosen over existing Tor

libraries and therefore if this application is to be successful the usability NFRs need to be met.

The legal NFR of only using open source software needs to be achieved as the project has a budget of

zero. This is therefore a simple, but critical, requirement to implement.

The reliability NFR, to have no more than ten bugs on delivery, was explicitly mentioned by the client.

However, it could prove challenging to assess whether this requirement has been met. Whilst testing may

show that there are few or no bugs in the application, this might not be a true representation of the

application because there may be bugs in the application that did not show up during testing.

4.1.2. Design

By using the Agile Method, the upfront design is minimized; the developer only designs what is

required for each iteration, which dramatically reduces the large upfront design cost that other

methodologies incur. Moreover, by only implementing the design as and when it is required, risk is

reduced and the developer ensures that all the necessary features are designed. Implementing the

entire design in one go could lead to features going unused, making the application confusing to the end user.

This does not mean that features designed in earlier iterations will not be carried over to later

iterations of the application.

Despite being the first iteration, some design decisions made here will impact the rest of the

application. An example of a design decision that will affect the entire application is the programming

language used as this will not be able to be changed after the first iteration without dramatic

consequences. This makes the choice of programming language a critical design decision.

There were three key contenders for programming languages: C, Python and Java. C was discounted

as the author has considerably less experience in this language than either Python or Java. To decide

which of the two remaining languages was more suitable for this project, the advantages and disadvantages of

each were considered. Python was found to be the more suitable language for this project, as existing

Tor libraries use Python and it makes sense to use the language that Tor developers are already using

as it will help to achieve the application’s goal of being used for future development. Another reason

for choosing Python over Java was that using Python it is much easier and more effective to extract

bytes from packets of network data than it is using Java. Despite this, Java was a serious contender

due to the developer’s considerable experience in the language and the speed at which Java can run

– which can be up to ten times quicker than Python. The decision was further complicated by the fact

that both languages are cross-platform compatible and therefore would both be able to achieve the

non-functional portability requirements. Python is not without its disadvantages in relation to this

project; at the start of the project the developer was relatively inexperienced in this language, and

threading in Python is extremely hard and has been strongly criticised as being “fundamentally

broken” (Wittber, 2009). The deciding factor was that the client implied that he had a preference for

Python being used for this application.

The version of Python to be used was also seriously considered, with the final choice being Python 2.x.

Despite being the older version of the language, this was deemed the most appropriate version of the

language to use as several existing libraries anticipated to be used to provide functionality are

currently only fully compatible with Python 2.x. While some libraries have 3.x versions available, these

still contain bugs and tend to be considered beta releases.

The operating system used to develop the application is an unimportant decision, as the portability

NFRs state that the application must be compatible with all operating systems and Python can be used

on all operating systems. The only potential issue is that Python may have to be installed separately on some

platforms, notably Windows, although it comes preinstalled on most Linux distributions. This also applies to the libraries

that the developer expects to use throughout the project. However, the developer’s personal

preference for developing applications is to use Linux and consequently this operating system will be

used to develop the entire application.

As discussed in the requirements and specification section of this report, the application does not

require a GUI as it would bring no benefit to the application. Designing a user-friendly, efficient and

scalable GUI would take considerable time and the absence of a GUI significantly reduces the

complexity of the design section. The time saved by not having to design and develop a GUI will be

invested into further increasing the quality of the code, as well as into trying to

implement more of the requirements.

Design features of a library

To achieve the usability NFR, it is important to consider the way that the library will be designed. A

poorly designed library would likely not be used for future development as developers would probably

opt for one of the existing libraries if it were significantly easier to use. It is therefore important to

ensure that the design structure is simple to use, is intuitive and promotes efficiency.

To enhance usability, the single responsibility principle will be implemented; this means that each

component implemented in the library should only be responsible for a single section of functionality

or a single feature. This makes it easier for the user to understand precisely what they can expect from

each function of the application, which should help to make users feel confident in further developing

the application in the future.

Two naming conventions are popular in Python: mixed case, and lower case with words separated by

underscores. The Python PEP 8 documentation recommends that words be separated by

underscores, as this is claimed to improve readability (Van Rossum, 2014). Therefore, this naming

convention was chosen as it would further achieve the usability NFR.

The names of the relevant functions and variables also needed to be considered. Variable names such

as x, y, etc. are extremely poor names - they do not give any information about the data they contain.

It was decided that all names should provide as much information as possible whilst remaining a sensible

length. This will facilitate easy development and usability as there should be no confusion over what

a variable contains or what a function will do; the name should make this information clear to the user.


Both the chosen naming convention and the descriptive variable names help to meet the NFR of

maintainability – making it easier for the current developer to work on the application as well as for

users to further develop it in the future.

To further achieve the usability NFRs, detailed comments about all functions within the application

will be required. These should provide the user with information concerning the required input, what

the function does and what the function will return.
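
A short illustration of these conventions is given below. It is only a sketch: the function `format_ipv4_address` is a hypothetical example chosen to show lower-case names separated by underscores, a descriptive parameter name, and a comment block covering the input, the behaviour and the return value. It is not a function taken from the library itself.

```python
def format_ipv4_address(address_bytes):
    """Convert four raw address bytes into dotted-decimal notation.

    Input:   address_bytes, a bytes object (or list of integers) of length 4.
    Does:    checks the length, then joins the four values with dots.
    Returns: the address as a string, for example "192.0.2.1".
    """
    octets = list(bytearray(address_bytes))
    if len(octets) != 4:
        raise ValueError("an IPv4 address needs exactly 4 bytes")
    return ".".join(str(octet) for octet in octets)
```

Because the function does one thing only, it also follows the single responsibility principle described earlier in this section.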

It could be argued that the above features are not strictly necessary as a good application would

always be favoured over a lesser application, however the amount of effort required to make these

significant improvements is negligible and the implementation of these decisions could potentially

increase the speed of development by making it easier for the developer to identify pre-existing

functions.

Version control

Version control is an essential part of the project. Although it will not directly affect

the development of the project, it is a safeguard: were anything to go wrong, the code can

be retrieved from a specified point. It also offers the ability to track all changes made to the code,

which will help locate bugs within the application. For this project, Git was selected over Subversion.

This decision was based on the personal preference of both the developer and the client, who also

uses Git and therefore it was easy to share code between the parties involved in the application’s development.


Design conclusions

This section has required more time than was previously anticipated because so many of the

design decisions that needed to be made in the first iteration would have effects upon the entire

development of the project. It was therefore essential that sufficient thought and consideration was

put into these decisions, as failure to make the right choice would lead to greater delays later in the

project development.

It could also be argued that creating such a detailed design in the first iteration will speed up the

development process and ensure that potential issues will be averted as a result of the decisions made

in this section.

4.1.3. Implementation

The aim of this iteration was to implement all four functional requirements, as detailed at the start of

this section. Furthermore it was hoped that as many of the NFRs as possible would also be achieved

during this time.

Developing the application required several pieces of software to be selected. The most important

tool was Sublime Text 2, a text editor in which the code was written. While

it is argued that an Integrated Development Environment (IDE) is more appropriate for developing

code, due to the extensive testing and debugging functionality that they provide, they are more

complicated to use than a text editor and the testing environment may not be suitable for this

application. It is also the developer’s preference to use a text editor, as he has more experience of this

method. By using the extensive testing functionality that Python provides, no negative effects of using

a text editor over an IDE will be present in the final application.

The developer tried to implement the functional requirements in order of their importance; for

example, connecting to Tor was the first step undertaken.

To do this an SSL connection was made to the Tor node. It used the Tor node’s IP address and ORPort.

This was easily implemented and only required the three lines of code shown below:

Figure 8 - Code snippet showing connection to a Tor node
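
A connection of this kind might look like the following sketch. It is not the project's original snippet: it is written for a current Python 3 interpreter rather than the Python 2.x the project targeted, and the function names, the timeout value and the decision to disable certificate checks (on the assumption that a Tor node's certificate is not signed by a public certificate authority) are all assumptions made for illustration.

```python
import socket
import ssl


def create_tls_context():
    """Build a TLS context that accepts a node's unverified certificate."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Assumption: the node's certificate is not signed by a public authority,
    # so hostname and certificate-chain checks are switched off here.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


def connect_to_node(ip_address, or_port, timeout=10):
    """Open a TCP connection to the node's ORPort and wrap it in TLS."""
    raw_socket = socket.create_connection((ip_address, or_port), timeout=timeout)
    return create_tls_context().wrap_socket(raw_socket)
```

A caller would then use something like `tls_socket = connect_to_node("192.0.2.10", 9001)`, where the address and port are placeholders for a node's real IP address and ORPort.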

While this is a simple method to connect to the Tor node, it works and is less complicated than other

methods and it was felt best to avoid over-complicating things where possible. However, an

improvement was almost immediately thought of. This method requires the user to know the IP

address and ORPort of the Tor node, which may not be easy to find out. To simplify the method, and

increase usability, it was decided that users should be able to enter either the nickname or the IP

address and ORPort of the node that they want to connect to. This was not implemented during this

iteration, as it was felt that this should be suggested to the client at the end of this iteration and, if

approved, implemented in the following iteration.

The second requirement, to be able to send and receive a version cell, was the next requirement to

be implemented. It was decided that, because all cells need to be created in the same format and

following the same protocol instructions, a function would be created to automatically pack a cell to

the correct format, thus preventing any code duplication. This achieves the usability and

maintainability NFRs. A build cell function was therefore created, which takes the command to be

used and the payload, and packs these into the correct cell format. This is shown in the

code below:
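
A cell-building function along these lines could be sketched as follows. This is an illustrative reconstruction rather than the project's own code. It assumes a two-byte circuit ID (as used by the older link protocol versions), the 509-byte payload of a fixed-length cell, and the command numbers 7 (version cell) and 8 (NetInfo cell) from the Tor protocol specification; the `circuit_id` parameter and its default value are additions made for this example.

```python
import struct

PAYLOAD_LENGTH = 509     # payload size of a fixed-length cell
COMMAND_VERSIONS = 7     # version cells are variable-length
COMMAND_NETINFO = 8      # NetInfo cells are fixed-length


def build_cell(command, payload=b"", circuit_id=0):
    """Pack a command and payload into the byte layout of a Tor cell."""
    if command == COMMAND_VERSIONS or command >= 128:
        # Variable-length cell: circuit ID, command, payload length, payload.
        header = struct.pack(">HBH", circuit_id, command, len(payload))
        return header + payload
    if len(payload) > PAYLOAD_LENGTH:
        raise ValueError("payload too long for a fixed-length cell")
    # Fixed-length cell: circuit ID, command, payload padded with zero bytes.
    header = struct.pack(">HB", circuit_id, command)
    return header + payload.ljust(PAYLOAD_LENGTH, b"\x00")
```

Keeping the packing logic in one function means that every cell type is built in the same way, so a change to the cell format only has to be made in one place.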

The decoding of the NetInfo cell was perhaps the most challenging requirement to be implemented

in this iteration. This was because the developer is still relatively inexperienced with Python and the

Tor protocol documentation is not very clear and contains several ambiguities. However, despite these

challenges, the NetInfo cell was successfully decoded, although the process overran the estimated

timescale dedicated to this section as a result of the aforementioned challenges.

To promote efficient code, the developer used an ‘If’ statement to dynamically extract data contained

within the packet. The NetInfo cell could contain multiple IP addresses or multiple formats of IP

addresses (i.e. IPv4 or IPv6). An appropriate, but somewhat inefficient, method would be to run

multiple ‘If’ statements for every possible eventuality. This, however, would mean at least eight ‘If’

statements would be required just to extract the client’s IP address. As the code below shows, the

developer managed to use a single ‘If-elif’ statement, by doing so dramatically reducing the chances

of errors in the code and increasing readability for users.
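
The approach can be sketched as below. This is a hedged reconstruction, not the developer's code: it assumes the NetInfo payload layout given in the Tor protocol specification (a four-byte timestamp, one address, a one-byte address count, then that many addresses, each stored as a type byte, a length byte and the raw value), it uses Python 3's `ipaddress` module, and the function and dictionary key names are invented for this example. A single if-elif inside one helper handles both address formats, so the same code serves every address in the cell.

```python
import ipaddress
import struct

ADDRESS_TYPE_IPV4 = 0x04
ADDRESS_TYPE_IPV6 = 0x06


def read_address(payload, offset):
    """Decode one type/length/value address; return (address, next offset)."""
    address_type, address_length = struct.unpack_from(">BB", payload, offset)
    start = offset + 2
    raw_value = payload[start:start + address_length]
    if address_type == ADDRESS_TYPE_IPV4 and address_length == 4:
        address = str(ipaddress.IPv4Address(raw_value))
    elif address_type == ADDRESS_TYPE_IPV6 and address_length == 16:
        address = str(ipaddress.IPv6Address(raw_value))
    else:
        address = None  # unknown format: skip it using the length byte
    return address, start + address_length


def decode_netinfo(payload):
    """Extract the timestamp and addresses from a NetInfo cell payload."""
    (timestamp,) = struct.unpack_from(">I", payload, 0)
    other_address, offset = read_address(payload, 4)
    (address_count,) = struct.unpack_from(">B", payload, offset)
    offset += 1
    sender_addresses = []
    for _ in range(address_count):
        address, offset = read_address(payload, offset)
        sender_addresses.append(address)
    return {
        "timestamp": timestamp,
        "other_address": other_address,
        "sender_addresses": sender_addresses,
    }
```

When the cell is sent by a Tor node, `other_address` is the client's address as seen by that node, and `sender_addresses` lists the node's own addresses.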

Due to the complexity of decoding the NetInfo cell, this iteration was already starting to fall behind

schedule. The decision was therefore made not to implement the handling of the destroy cell as part

of this iteration as it does not affect the core functionality and was merely a desirable, rather than a

core, requirement. However, it was mentioned to the client and it was agreed that this feature will be

implemented in a future iteration.
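
Although it was deferred, handling a destroy cell could be as simple as the sketch below, which turns the reason code in the first byte of the payload into a precise error message, in keeping with the usability NFRs. The reason names are taken from the Tor protocol specification, but only the first six are listed here; the function name and the message wording are assumptions made for this example.

```python
# A subset of the destroy reason codes from the Tor protocol specification.
DESTROY_REASONS = {
    0: "NONE",
    1: "PROTOCOL",
    2: "INTERNAL",
    3: "REQUESTED",
    4: "HIBERNATING",
    5: "RESOURCELIMIT",
}


def describe_destroy_payload(payload):
    """Return a readable message for the reason byte of a destroy cell."""
    reason_code = bytearray(payload[:1])[0] if payload else 0
    reason_name = DESTROY_REASONS.get(reason_code, "UNKNOWN")
    return "Circuit destroyed by node: %s (code %d)" % (reason_name, reason_code)
```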

4.1.4. Testing

Although testing may “Often feel like an exercise in futility or at best a waste of time” (Arbuckle, 2010,

p. 1), it is a critical area of development. Testing ensures the software functions according to the

expectations defined by the requirements/specifications. The overall aim of testing is to find bugs or

issues that would negatively affect the functionality of the application, its usability and/or


For this iteration, two kinds of testing will be conducted: functional testing, which verifies that a

function performs as expected using a small subset of inputs, and white box testing, where the tester

has full knowledge of the implementation.

To enable the application to be thoroughly tested, several testing methods were considered. It was

decided that the unittest framework and manual testing, used in combination, were the most appropriate

methods of testing for this application. This is because unittest makes it possible to quickly test a large

number of input values and is part of Python’s standard library. The results from unittest are also

displayed in a very comprehensible manner, making it easy to locate and fix bugs. Manual testing also

has its advantages, such as being able to test features of the application that unittest might not be

able to cover; manually testing each function will also allow a realistic user scenario to be tested.
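
To give a sense of how unittest covers many input values quickly, a minimal test case is sketched below. The function under test, `pack_versions_payload`, is a hypothetical stand-in for the library's version cell code, and the test names are likewise assumptions; the example is written for Python 3.

```python
import struct
import unittest


def pack_versions_payload(versions):
    """Hypothetical function under test: two big-endian bytes per version."""
    return b"".join(struct.pack(">H", version) for version in versions)


class VersionPayloadTest(unittest.TestCase):
    def test_single_version(self):
        self.assertEqual(pack_versions_payload([3]), b"\x00\x03")

    def test_many_input_values(self):
        # One test method can sweep a wide range of inputs.
        for version in range(0, 65536, 257):
            payload = pack_versions_payload([version])
            self.assertEqual(struct.unpack(">H", payload)[0], version)

    def test_rejects_out_of_range_version(self):
        # Values that do not fit in two bytes must raise an error.
        with self.assertRaises(struct.error):
            pack_versions_payload([70000])
```

Saved in a test module, these tests can be run with `python -m unittest`, which reports each failure together with the assertion that triggered it.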

To ensure that there are no anomalies in the results, for each testing round each test will be run three

times. Should any issues be presented, these will be investigated and corrected before re-running the

tests to ensure the bugs have been removed. This process will be continued until all bugs are

eradicated from the application.

| Test No. | Test | Test method | Succeeded? | Comments |
| --- | --- | --- | --- | --- |
| 1 | Can connect to a node | Unittest | Yes | |
| 2 | Passes correct value to version function | Unittest and Manually | | |
| 3 | Creates the correct version cell | Unittest | Yes | |
| 4 | Sends the version cell | Manual | Yes | |
| 5 | Receives the Netinfo cell | Unittest | Yes | |
| 6 | Able to extract the payload of a Netinfo cell | Unittest | Yes | |
| 7 | Successfully able to extract the data contained within the payload | Unittest | Yes | |
| 8 | Store the extracted data as a dictionary | Unittest | Yes | |
| 9 | Create the payload of a NetInfo cell to be sent | Unittest and Manually | First round: No. Second round: Yes | The first round of testing showed up an error where the IP addresses were being displayed as negatives. This was because they were not being formatted correctly; once this was fixed, the test passed. |
| 10 | Builds the NetInfo cell correctly to be sent | Unittest and Manually | | |
| 11 | Send the NetInfo cell to the first node | Unittest and Manually | | |

Figure 9 - Testing results for iteration 1

As can be seen in the testing results above, eleven tests were conducted for this iteration. All

functions were thoroughly tested, with ten tests being passed first time. One test, however, failed.

This test was to ensure that the correct payload of the NetInfo cell was created. The creation of the

NetInfo cell proved to be incorrect, as negative IP address values were being passed. This cannot be

allowed, and was found to be the result of the IP addresses being formatted using the signed char

method rather than the required unsigned char method. Once this had been changed, the test was rerun

and was successfully passed.
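
The effect of the signed and unsigned char formats can be reproduced with Python's struct module. The sketch below is illustrative only: the function names are hypothetical and the project's own formatting code is not reproduced here. It assumes each octet of an IPv4 address is handled as a single byte.

```python
import struct


def pack_ipv4(address):
    """Pack a dotted-quad IPv4 address into 4 bytes using unsigned chars ('B')."""
    octets = [int(part) for part in address.split(".")]
    return struct.pack("!4B", *octets)


def read_octets_signed(data):
    """Interpret 4 bytes as signed chars ('b'): octets above 127 become negative."""
    return struct.unpack("!4b", data)


def read_octets_unsigned(data):
    """Interpret 4 bytes as unsigned chars ('B'): every octet stays in 0..255."""
    return struct.unpack("!4B", data)
```

With the address 192.168.1.200, the signed interpretation yields (-64, -88, 1, -56), while the unsigned interpretation returns the expected (192, 168, 1, 200).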

4.1.5. Moving forward from Iteration 1

While this iteration has overrun the allotted time by two weeks, and not all of the functional

requirements have been met, for the most part it can be considered a success. A new project plan has

been created to show this delay and how it has been taken into account for future iterations, so as to

still ensure the project is completed on time; this can be found in appendix 4.

The delay is because a detailed design section was developed, and this should enable future design

development to be achieved more quickly and efficiently. The three functional requirements that

were implemented have been implemented successfully and to a high standard; for example, the use

of functions to reduce code duplication helped to achieve many of the usability NFRs.

The majority of the NFRs have already been achieved, which is a significant achievement in such a

small amount of time.

The testing of the implemented features was a success, despite one function requiring a bug to be

dealt with. Moving forward to iteration 2, the use of an Onion router nickname as well as the IP

address to connect to a Tor node will be recommended to the client, and the timescale will be altered

to take the complexity of the Tor protocol documentation into account.

4.2. Iteration 2

4.2.1. Requirements

During the demonstration of the previous iteration, the client was generally pleased with the

development to date. The unachieved requirement of handling data in the destroy cells was

mentioned to him and he explicitly requested that this be completed in this iteration. It was also

decided to implement the use of Onion router nicknames to identify Tor nodes, in addition to the

existing IP addresses, to improve usability. This discussion, as well as several informal interviews

conducted with potential end-users of the application, elicited the following requirements.

| Requirement | Importance Level |
|---|---|
| Create a circuit through the Tor Network | High |
| Create a circuit of any length | High |
| Create a stream through Tor to a web server | High |
| Able to retrieve webpages from an internet web server through Tor | |
| Create a circuit using specified nodes | Medium |
| Create multiple streams through Tor to a web server | Medium |
| Handle errors from destroy cells | Low |

Figure 10 - Functional requirements for iteration 2

It was evident that no further NFRs needed to be added to the original specification and that the

existing NFRs should be carried forward into this iteration.

The requirement of creating a circuit through the Tor network is perhaps the most challenging

requirement faced in the project to date. To achieve it, the shared keys between the client and each

node will need to be calculated, and the encryption of packets will also need to be implemented.

This is extremely difficult and the Tor documentation is, once again, full of ambiguities, which

proves a major challenge to developers.

In light of this challenging requirement, the predicted timescale has increased from three weeks to

four and the hours allocated to the project have been increased in order to develop this aspect of the

application. However, once this requirement is implemented it will provide a base upon which the

application can be developed. It is therefore critical that this requirement be fulfilled during this

iteration.

The requirement of being able to create a circuit of any length should be easily achieved once the

requirement of being able to create a circuit has been successfully implemented, as it expands on the

code used for this process.

The creation of a stream through Tor to a web server is also likely to be a simple requirement to fulfil,

as this will, once again, expand on the circuit that has been created. Overall, this iteration contains

some very difficult requirements to achieve, with the majority of the requirements being dependent

on the successful creation of a circuit through Tor.

4.2.2. Design

The design for this iteration builds on the design section from the previous iteration; however, one

small design feature needed to be considered before implementation. With the greater use of

cells, each requiring a different command to be associated with it, it was important that the

method used to identify the cell was clear. There were two possible solutions: to use the command ID

number of the cell or to use the English name of the command. This was considered at great length.

Using the English name would make the code easier to use; however, it would also increase the

chance of errors being introduced into a program, e.g. through a spelling mistake. With so many

different commands having similar names, it might also make it harder for the user to decide which

command to use. It was therefore decided to use the number-based commands for setting the

commands in packets, due to the lower risk of a user entering the incorrect number, the reduced

chance of errors due to spelling mistakes, and the easier implementation, as no formatting issues

need to be considered. The numeric form was also chosen above the English string version because

this is what is currently used in the Tor protocol.
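
A minimal sketch of number-based command identification is given below. It assumes the fixed-length cell layout with a 2-byte circuit ID used by older link protocol versions, and it lists only a subset of the command numbers defined in the Tor specification. `CellCommand` and this version of `build_cell` are illustrative names and signatures, not the library's actual code.

```python
import struct
from enum import IntEnum


class CellCommand(IntEnum):
    """Subset of Tor cell command numbers (names are for readability only)."""
    PADDING = 0
    CREATE = 1
    CREATED = 2
    RELAY = 3
    DESTROY = 4
    VERSIONS = 7   # variable-length cell, listed for reference only
    NETINFO = 8


PAYLOAD_LEN = 509  # payload size of a fixed-length cell


def build_cell(circ_id, command, payload=b""):
    """Build a fixed-length cell: circuit ID, command number, padded payload."""
    command = CellCommand(command)  # raises ValueError for an unknown number
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload exceeds %d bytes" % PAYLOAD_LEN)
    header = struct.pack(">HB", circ_id, command)
    return header + payload.ljust(PAYLOAD_LEN, b"\x00")
```

The caller still passes a plain number, for example `build_cell(1, 1)` for a create cell, but an unknown number is rejected immediately rather than being sent to the node.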

4.2.3. Implementation

The key requirement for this iteration was to build a circuit through the Tor network. This needed to

be split into two parts. The first challenge was to connect to the first node, and once this was successful

the developer had to connect to the second and subsequent nodes. This was split into two parts

because the cells that need to be sent are different. The cell sent to the first node needs

to be a ‘create’ cell, and a ‘created’ cell is expected back. For the second node onwards, an

‘extend’ cell is sent and an ‘extended’ cell is received back.

Here an implementation decision needed to be made. Tor currently uses two handshake methods for

agreeing encryption keys (Dingledine & Matthewson, n.d.): the NTOR and TAP protocols. For this

project neither method offered a decisive advantage; however, the TAP handshake is slightly easier

to implement and is the original Tor handshake, which means that it can be used on nodes running

older versions of Tor, whereas the newer NTOR protocol may not be supported by them. In addition,

more documentation is available for the TAP handshake, and for these reasons it was chosen as the

method to be used in the project.

To create a ‘create’ cell, the Diffie-Hellman protocol must first be used to create a shared key which

only the client and the first node know. To calculate the client’s data for the handshake, a function

was created to prevent code duplication, as this will be used extensively for circuit creation

throughout the application.

The code below shows the DH protocol being conducted, with the creation of x (the private key) and

X (the public key) that will be sent to the Tor node. The public key is encrypted with the remote onion

router's key. To avoid introducing errors into the application, the decision was made to use a

hybridEncryption implementation that had already been created by Dr Gareth Owen. This was preferred

to creating our own because the time for this iteration was fast running out due to its complexity. It

allowed more time to be spent on other sections rather than redeveloping something that already has

a proven success rate. Finally, it was chosen because it is thoroughly tested and is therefore unlikely

to introduce any bugs into the application.

Figure 11 - Code snippet showing the creation of Diffie Hellman public and private keys
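
The Diffie-Hellman step itself can be sketched as follows. This is an illustration under stated assumptions, not the project's function: the modulus and generator below are toy values chosen only to keep the example short (the Tor specification prescribes generator 2 and a fixed 1024-bit prime), the function names are hypothetical, and the hybrid encryption of X to the onion router's key is omitted.

```python
import secrets

# Toy parameters for illustration only; NOT the values Tor uses.
TOY_PRIME = 2**127 - 1
TOY_GENERATOR = 3


def dh_keypair(p=TOY_PRIME, g=TOY_GENERATOR):
    """Return (x, X): a random private exponent and the public value g^x mod p."""
    x = secrets.randbelow(p - 3) + 2   # private key in the range 2..p-2
    X = pow(g, x, p)                   # public key sent to the other side
    return x, X


def dh_shared_secret(their_public, my_private, p=TOY_PRIME):
    """Combine the other side's public value with our private key."""
    if not 1 < their_public < p - 1:
        raise ValueError("degenerate Diffie-Hellman public value")
    return pow(their_public, my_private, p)
```

Both sides arrive at the same value: the client computes `dh_shared_secret(Y, x)` from the node's public value Y, and the node computes `dh_shared_secret(X, y)`.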

Once the payload and the client’s half of the Diffie-Hellman key had been calculated using the above

function, the packet was created using the build cell function as previously implemented, thus

dramatically reducing code duplication. It is important that the client’s private key, x, is stored, as it

will be needed to decrypt the packets that are received. For this, a variable within the TorCircuit class

has been used. This method of storing the key was chosen over, for example, storing all keys in an

array, as it makes the key easier to use later on in the application, with less chance of error.

The Tor circuit class is shown below:

Figure 12 - Code snippet showing the Tor Circuit Class

It was then important to receive and decode the received ‘created’ cell, as this would complete the

handshake and provide the server’s public key and the shared key. This means that packets could be

encrypted to the first node, the first step of the circuit. To retrieve the data contained within the

packet, the payload was extracted and, using the Tor documentation, which states where in the

packet each piece of data is located, we were able to extract the public key, the derived

key data as well as a unique key shared between the client and the first node.

This is shown in the code below:

Figure 13 - Code snippet showing the decoding of a Create cell

As shown in the code above, the calculated KH, Df, Db, Kf and Kb are returned to TorHop. TorHop is a

class which handles the creation of the circuits. By saving the values of these variables in

the class, they can easily be retrieved later on, when they are needed for encrypting and

decrypting packets.
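
How KH, Df, Db, Kf and Kb can be obtained from the shared secret is sketched below, following the key derivation that the Tor specification describes for the TAP handshake (repeated SHA-1 over the shared secret and a counter byte). `HopKeys`, `split_created_payload` and `derive_hop_keys` are hypothetical names used for illustration; the project's TorHop class and decoding function are not reproduced here.

```python
import hashlib
from dataclasses import dataclass

DH_LEN = 128    # bytes of the node's DH public value in a TAP 'created' payload
HASH_LEN = 20   # SHA-1 output length
KEY_LEN = 16    # AES-128 key length


@dataclass
class HopKeys:
    KH: bytes   # derivative key data, proves the node knows the shared key
    Df: bytes   # forward digest seed
    Db: bytes   # backward digest seed
    Kf: bytes   # forward encryption key
    Kb: bytes   # backward encryption key


def split_created_payload(payload):
    """Return (node's DH public value, derivative key data) from the payload."""
    gy = int.from_bytes(payload[:DH_LEN], "big")
    kh = payload[DH_LEN:DH_LEN + HASH_LEN]
    return gy, kh


def kdf_tor(k0, length):
    """Expand k0 as SHA1(k0 | 0x00) | SHA1(k0 | 0x01) | ... up to length bytes."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha1(k0 + bytes([counter])).digest()
        counter += 1
    return out[:length]


def derive_hop_keys(shared_secret, expected_kh):
    """Derive the five values and check KH against the value the node sent."""
    k0 = shared_secret.to_bytes(DH_LEN, "big")
    m = kdf_tor(k0, 3 * HASH_LEN + 2 * KEY_LEN)
    keys = HopKeys(m[0:20], m[20:40], m[40:60], m[60:76], m[76:92])
    if keys.KH != expected_kh:
        raise ValueError("derivative key data does not match KH")
    return keys
```

Keeping the five values together in one object per hop mirrors the design choice described above: each hop's keys can be looked up by name later rather than by position in a list.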

This method of extracting the shared key data, as well as various other keys, could be reused for

decoding received ‘extended’ cells, as these share the same methods of encryption. The only

difference is that the ‘extended’ cells would first require decryption as they would be encrypted with

other Onion node keys.
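
The idea of removing one layer of encryption per hop can be sketched as follows. To keep the example dependent on the standard library only, a stand-in keystream built from SHA-256 replaces the AES-128 counter mode cipher that Tor actually uses, and the digest check that Tor performs alongside the 'recognized' field is omitted; all function names are hypothetical.

```python
import hashlib


def keystream(key, length):
    """Stand-in keystream (SHA-256 over key and counter); Tor uses AES-128-CTR."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def apply_layer(key, payload):
    """XOR the payload with the keystream; adds or removes one layer."""
    stream = keystream(key, len(payload))
    return bytes(a ^ b for a, b in zip(payload, stream))


def peel_layers(payload, backward_keys):
    """Remove layers nearest hop first until the 'recognized' field is zero.

    backward_keys holds one key per hop, ordered from the first hop outwards.
    Returns (index of the hop that originated the cell, decrypted payload).
    """
    for hop_index, key in enumerate(backward_keys):
        payload = apply_layer(key, payload)
        if payload[1:3] == b"\x00\x00":   # bytes 1-2 of a relay payload
            return hop_index, payload
    raise ValueError("cell was not recognised after removing every layer")
```

A cell originating at the second hop is wrapped once by that hop and once more by the first hop on its way back, so the client removes the first hop's layer, sees a non-zero 'recognized' field, and continues with the second hop's key.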

The method to decrypt the cells is shown below:

Figure 14 - Code snippet showing the decryption function

This was extremely challenging to implement as it required the public and shared keys of the node as

well as other encryption variables to be correctly stored as these would later be used in decryption. It

was vital to get these in the right order, as otherwise the packets would not be correctly decrypted or


During the implementation of this requirement, the Tor protocol became a hindrance to the

development. This was mostly down to the fact that Tor does not send back a cell if an

incorrectly configured cell has been sent, thus not giving the user any feedback about what has gone

wrong. It was also discovered that, if a cell was received, it would be either a destroy cell that

contained little to no information, or a relay cell that could not be decrypted to provide any useful

information because the data was not being extracted properly. This was a very frustrating time for

the developer, as many days were spent trying to chase down an error without knowing where to even

start looking for it.

After several weeks chasing down errors, a circuit could finally be created and thus the first two

requirements of this iteration were successfully completed. By this point, however, the project was

running behind schedule; this can be attributed to the complexity of Tor and the accompanying

documentation. Fortunately, from this point onwards, this iteration’s remaining requirements proved

simple to implement, as creating a stream used the circuit and encryption previously implemented and

it was simply a case of sending a specially configured relay cell through the Tor network. This is shown

in the code below:

Figure 15 - Code snippet showing the creation of the Stream cell

As shown in the above code section, the creation of a stream simply required the host name or IP

address along with the port. These are passed to the build cell function, which builds a relay cell

containing the data, before the cell is encrypted with the selected nodes’ encryption keys and sent.

Once again this was created as a function within the TorCircuit class.
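
As a sketch of what such a specially configured relay cell contains, the example below builds the unencrypted relay payload for a stream request, using the relay header layout given in the Tor specification (command, recognized, stream ID, digest, length, data). The function name and signature are hypothetical; the digest is left as a zero placeholder and the encryption step is omitted.

```python
import struct

RELAY_BEGIN = 1        # relay command that opens a stream
RELAY_DATA_LEN = 498   # data bytes available after the 11-byte relay header


def build_begin_payload(host, port, stream_id):
    """Build the 509-byte relay payload asking the exit to open host:port."""
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    # The target is sent as a NUL-terminated "host:port" string.
    data = ("%s:%d" % (host, port)).encode("ascii") + b"\x00"
    if len(data) > RELAY_DATA_LEN:
        raise ValueError("address too long for one relay cell")
    # command, recognized (0), stream ID, digest (placeholder 0), data length
    header = struct.pack(">BHHIH", RELAY_BEGIN, 0, stream_id, 0, len(data))
    return header + data.ljust(RELAY_DATA_LEN, b"\x00")
```

In a full implementation the digest field would be filled in and the payload encrypted once per hop before being placed in a relay cell.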

The regularly received destroy cells provided the perfect opportunity to achieve the requirement

carried forward from Iteration 1 - to handle the destroy cells, which now inform the user of the cause

of the error.

Figure 16 - Code snippet showing the conversion of error codes to English

The above code snippet shows how the error code contained within the destroy cell is passed to the

function, which looks the code up in a dictionary containing all error codes and their meanings

before returning the English error to the user.
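
A dictionary lookup of this kind might look like the sketch below. The reason codes and names follow those listed in the Tor specification for destroy cells; the descriptions are paraphrased, and `describe_destroy` is a hypothetical function name rather than the one used in the project.

```python
DESTROY_REASONS = {
    0: "NONE: no reason given",
    1: "PROTOCOL: Tor protocol violation",
    2: "INTERNAL: internal error",
    3: "REQUESTED: a client sent a TRUNCATE command",
    4: "HIBERNATING: node is not currently operating",
    5: "RESOURCELIMIT: node is out of memory, sockets or circuit IDs",
    6: "CONNECTFAILED: unable to reach the next relay",
    7: "OR_IDENTITY: relay identity was not as expected",
    8: "OR_CONN_CLOSED: the connection carrying this circuit died",
    9: "FINISHED: the circuit has expired",
    10: "TIMEOUT: circuit construction took too long",
    11: "DESTROYED: the circuit was destroyed without a client TRUNCATE",
    12: "NOSUCHSERVICE: request for an unknown hidden service",
}


def describe_destroy(payload):
    """Translate the reason byte at the start of a destroy payload to English."""
    code = payload[0]
    return DESTROY_REASONS.get(code, "unknown reason code %d" % code)
```

Falling back to a message that includes the raw number means that a code missing from the dictionary is still reported rather than causing a second error.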

This method was chosen over simply informing the user that a destroy cell has been received

because it aims to achieve the NFR of usability. Providing the user with a more detailed and

in-depth error message will allow for easier debugging.

4.2.4. Testing

As with the first iteration, testing will be conducted using unittest and manually. However, unlike the

first iteration, which only looked at functional and white box testing, this testing section also needs to

consider integration testing.

An integration test verifies that the parameters passed between modules are handled correctly, and is

used when a module is developed at a later stage than the module it is interacting with. This is an

important testing area to complete as it will ensure that functions that were developed in the first

iteration are capable of being used for functionality developed in this iteration. Although the

effectiveness of this test can be argued, with some suggesting it is a waste of time, the little time it

adds to the testing makes it worthwhile, especially if a bug is found, as this can be quickly fixed and

will therefore not affect functions later on during the development that may use it.

The testing strategy can be seen below; this shows the test to be conducted, how it was conducted

and if the test was successful. As with the first iteration, in each round the tests were run three times

to ensure no anomalies were present in the results.

| Test No. | Test | Test method | Succeeded? | Comments |
|---|---|---|---|---|
| 12 | Connect to the first hop | Unittest | Yes | |
| 13 | Calculate the shared key between client and node | | | |
| 14 | Create a CREATE cell containing the relevant data | Unittest | Yes | |
| 15 | Send the CREATE cell to the first node | Unittest | Yes | |
| 16 | Receive the CREATED cell back from the first node | Unittest | Yes | |
| 17 | Ensure a cell of cmd 3 is received | Unittest | Yes | |
| 18 | Extract the payload of the CREATED cell | Unittest | Yes | |
| 19 | Extract the first node's half of the key | Unittest | Yes | |
| 20 | Calculate KH, Df, Db, Kf, Kb from the payload | Unittest | Yes | |
| 21 | Check the derived key data is the same as KH | Unittest | Yes | |
| 22 | Calculate the shared key | Unittest | Yes | |
| 23 | Ensure the nodes entered in the array are correctly passed to the function | Unittest / Manually | First round: No; Second round: Yes | Failed the first round because a single value was being passed for all nodes rather than a value for each node. This was a simple change to make, and once made the test passed the second round. |
| 24 | Search the consensus by a node's nickname for its IP address and OR port | Unittest | Yes | |
| 25 | Packs the IP address and OR port of a selected node in the right format | Unittest | Yes | |
| 26 | Calculate the shared key half | Unittest | Yes | |
| 27 | Build the EXTEND cell | Unittest | Yes | |
| 28 | Correctly encrypt the packet | Unittest | Yes | |
| 29 | Ensure the packet count is correct | Unittest | Yes | |
| 30 | Send the packet to the correct node | Unittest | Yes | |
| 31 | Receive an EXTENDED packet back | Unittest | Yes | |
| 32 | Handle a destroy cell correctly | Unittest | Yes | |
| 33 | Extract the payload of the EXTENDED packet | Unittest | Yes | |
| 34 | Decrypt correctly the payload of an EXTENDED packet | Unittest | Yes | |
| 35 | Ensure a RELAY_EXTENDED cell is received | Unittest | Yes | |
| 36 | Calculate Shared key | Unittest | Yes | |
| 37 | Extract derivative key data from the payload | Unittest | Yes | |
| 38 | Ensure KH and derivative key data are the same | Unittest | Yes | |
| 39 | Ensure KH, Df, Db, Kf, Kb are updated to the TorHop object | Unittest | Yes | |
| 40 | Ensure a stream can be created to a specified webserver | Unittest / Manually | First round: No; Second round: Yes | This failed first time due to incorrect formatting of the target webserver. Once the IP address or web address and the port were formatted correctly, the second round was passed. |
| 41 | Ensure the payload of the stream packet is correctly formatted | Unittest / Manually | | |
| 42 | Correctly create the stream relay cell | Unittest | Yes | |
| 43 | Correctly encrypts the packet to allow it to be sent through the network | Unittest | Yes | |
| 44 | Ensure a packet is received back from the Stream request and handled appropriately | Unittest | Yes | |
| 45 | Ensure a RELAY_CONNECTED cell is received | Unittest | Yes | |
| 46 | Check the data (GET request) is correctly formatted to a packet | Unittest / Manually | | |
| 47 | Ensure the packet is encrypted correctly | Unittest | Yes | |
| 48 | Ensure a packet is received back and handled appropriately | Unittest | Yes | |
| 49 | Ensure all the data is received | Unittest / Manually | First round: No; Second round: Yes | In the first round only a single packet was received, which did not contain all the data. This showed that more than one packet must be read, which was implemented using a while True loop and allowed the test to pass the second round. |

Figure 17 - Testing results iteration 2

As shown in the above test results, three tests failed first time. This was to be

expected on such a complex iteration; fortunately, the three tests that did fail were easily

corrected. For example, test 49 (ensure all the data is received) failed because it was wrongly assumed

that all data would be received in a single packet. Once this was found not to be the case, a simple

loop was implemented to ensure all packets were received. Once completed, the test was rerun and was

passed.

Overall, the tests that failed did so not because of issues in the functionality of the application but

rather because of developer error. This shows the importance of testing, so that issues such as these

can be picked up early on during development and fixed.

4.2.5. Moving forward from Iteration 2

As with the first iteration, this iteration also overran the predicted timescale by two weeks; it will

therefore be necessary in the next iterations to increase the amount of time dedicated to the

development. A new project plan has been created to show this delay and how it has been taken

into account for future iterations, so as to still ensure the project is completed on time; this can be

found in appendix 5.

The main reason for the delay is that the project is proving much harder than first expected;

the lack of error messages provided by Tor is proving to be the most difficult feature of the Tor

protocol to handle, and days have been spent trying to debug the software without knowing what

is going wrong. With normal development of an application, if something goes wrong an error message

is presented to the developer to indicate the area where the error occurred, but with the Tor

protocol this does not happen. A further delay is caused by the confusing Tor protocol

documentation, which contains many inconsistencies, making it difficult to understand exactly what is

required for each function, and on several occasions help from the Tor community has

been required to understand certain points.

However, all requirements in this iteration were completed, including the requirement that was not

completed in iteration 1. All functional requirements have been completed, while also satisfying the

NFRs of maintainability and usability. It would have been quicker to develop the functions without

considering these requirements, but considering them during development will ensure an

application that meets and exceeds the client’s expectations.

4.3. Iteration 3

4.3.1. Requirements

During the demonstration of the second iteration to the client, the response was generally positive, as

they understood just how complex development has been so far, with issues such as the

poor Tor documentation and lack of error messages seriously hampering development. When

questioned about how the development will be improved, the client was satisfied with the suggestion

of increasing the amount of time spent on development.

No requirements were left outstanding from previous iterations, so by using the same requirements

elicitation methods as used in previous iterations, five requirements were put in place for this

iteration.


| Requirement | Importance Level |
|---|---|
| Retrieve the three responsible HSDir nodes for a specified hidden service | |
| Create a circuit to a HSDir server responsible for the selected hidden service | |
| Retrieve the service descriptor for the selected hidden service | |
| Download and save the service descriptor | High |
| Connect to a selected rendezvous point | High |

Figure 18 - Requirements iteration 3

The requirement of selecting a rendezvous point should be relatively simple to complete as this will

use the circuit functions in the previous iteration. This requirement is also independent from the other

four requirements and will therefore be the first requirement to be implemented. This is because the

developer’s time will be more effectively spent on the other four requirements which are predicted

to be the more challenging requirements to implement.

The other four requirements are all dependent on each other. For example, a circuit cannot be created

to an HSDir unless the HSDir is known. However, creating the circuit to an HSDir should again be a

relatively simple requirement to implement as it uses functionality that has previously been implemented

in the application.

The retrieval of the three responsible HSDirs for a selected hidden service is likely to be the hardest

challenge to date. It was suggested to the client that another application be used to find this

information out as this would dramatically simplify this requirement, however they explicitly

requested the application should be able to calculate these.

4.3.2. Design

The design for this iteration continues to build on the design section from previous iterations; however,

during this iteration a hidden service descriptor will also be downloaded, and this file will need to be

decrypted so that the data within it can be used.

Several methods are appropriate to store the file for use later on. These include storing the file in a

variable such as an array, or creating a text file and storing the data in this.

Both methods were considered at length. Storing the data in a variable was strongly

considered, as it does not require any additional resources such as file creation rights, and the data

would be simple to retrieve at a later point. However, the decision was made to instead store the data

in a text file. This was chosen because a service descriptor is valid for 24 hours, and the application

could reuse the stored descriptor at a later time if it is still valid, even after the application has been

shut down. This would not be possible if it had been saved in a variable. It was also predicted to be

simpler to extract the relevant data from a text file than from a variable.

The issue with using up memory on the client’s machine was also addressed; when the application is

finished, it will remove the file ensuring that, once the application is closed, the client is left with the

same amount of memory as they would have had before running the application.

It was also decided that, because these requirements now aim at offering different functionality, a new

directory should be used for these functions. This would help achieve the NFRs of usability and

maintainability.


4.3.3. Implementation

The simplest requirement, to create a rendezvous point, was implemented first because it could be

quickly implemented, did not depend on any other requirements and fully reused functionality

that had previously been implemented.

To create a rendezvous point, a node first needed to be chosen; this could be any Tor node,

and a circuit was then created to it. The only difference to a standard circuit is that, once connected,

a rendezvous cookie needed to be sent; this is a random 20-byte piece of data. There are many

methods that could achieve this. One would be to send a static 20-byte value; however, this would not

be appropriate, because the Tor specification suggests the cookie should be unique, and if the

application needed to create two rendezvous points on the same node, sending the same rendezvous

cookie would result in only one rendezvous point being created. It was therefore decided to create

the cookie completely randomly, and to use the TorCircuit class to store it for a later date, as with the


Figure 19 - Code snippet showing the creation of a rendezvous point

As the above code shows, the rendezvous cookie was simply sent in a relay cell, as created in a

previous iteration.
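
Generating and remembering a unique cookie can be sketched in a few lines. The class and method names below are hypothetical and stand in for the storage the project places in its TorCircuit class; the relay command number for establishing a rendezvous point is taken from the Tor specification, and building and encrypting the relay cell itself is omitted.

```python
import os

RELAY_ESTABLISH_RENDEZVOUS = 33   # relay command carrying the cookie
COOKIE_LEN = 20                   # the cookie is 20 random bytes


class RendezvousCookies:
    """Hypothetical store that remembers which cookie belongs to which node."""

    def __init__(self):
        self._by_cookie = {}

    def allocate(self, node_nickname):
        """Create a fresh random cookie, never reusing one already issued."""
        cookie = os.urandom(COOKIE_LEN)
        while cookie in self._by_cookie:
            cookie = os.urandom(COOKIE_LEN)
        self._by_cookie[cookie] = node_nickname
        return cookie

    def node_for(self, cookie):
        """Look up the rendezvous node that a cookie was issued for."""
        return self._by_cookie[cookie]
```

Because every call to `allocate` returns a different value, two rendezvous points established on the same node are kept distinct, which is the situation a static cookie would break.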

To retrieve the hidden service descriptor for a specified hidden service, we must first find the Tor

nodes responsible for hosting that hidden service descriptor.

Nearly a week was spent developing a function that was capable of this; however, there were several

errors in the function and, due to the lack of error messages and feedback from the Tor protocol, the

errors could not be tracked down. With time rapidly diminishing and the client expecting results,

it was decided to research an existing solution that could be used in its place.

Donncha O'Cearbhaill, a respected researcher in the field of Tor hidden services, has created a script

to retrieve hidden service directory servers for a hidden service. Although his script did not meet our

exact requirements, it was adapted and modified; using his function as a starting point, a complete

function was created that retrieves the list of all hidden service directory servers for a selected hidden

service. This allowed development to carry on and, since this single function had caused another delay

of more than two weeks, it was decided that more time would be dedicated to the project.

This single function proved to be the most challenging aspect of the project to date.
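
The calculation involved can be sketched as follows, assuming the version 2 hidden service scheme in which a descriptor ID is derived from the onion address, the current time period and a replica number, and the responsible directories are the HSDir nodes whose fingerprints follow that ID in a sorted ring. This is an independent illustration with hypothetical function names; it is neither the adapted script nor the project's function, and obtaining the list of HSDir fingerprints from the consensus is left out.

```python
import base64
import bisect
import hashlib
import struct

SECONDS_PER_DAY = 86400


def descriptor_id(onion_address, replica, now, descriptor_cookie=b""):
    """Compute one descriptor ID for a 16-character (version 2) onion address."""
    name = onion_address.split(".")[0].upper()
    permanent_id = base64.b32decode(name)          # 80-bit service identifier
    if len(permanent_id) != 10:
        raise ValueError("expected a 16-character onion address")
    # The period boundary is offset by the first byte of the identifier.
    offset = permanent_id[0] * SECONDS_PER_DAY // 256
    time_period = (now + offset) // SECONDS_PER_DAY
    secret_id_part = hashlib.sha1(
        struct.pack(">I", time_period) + descriptor_cookie + bytes([replica])
    ).digest()
    return hashlib.sha1(permanent_id + secret_id_part).digest()


def responsible_hsdirs(desc_id, hsdir_fingerprints, count=3):
    """Pick the `count` fingerprints that follow desc_id, wrapping around."""
    ring = sorted(hsdir_fingerprints)
    start = bisect.bisect_right(ring, desc_id)
    return [ring[(start + i) % len(ring)] for i in range(min(count, len(ring)))]
```

Calling `descriptor_id` once per replica and passing each result to `responsible_hsdirs` gives the full set of directories that should hold the descriptor for the current time period.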

Once the HSDir servers were known the final three requirements could be achieved. The creation of a

circuit to a responsible HSDir simply used functionality implemented in a previous iteration to create

the required circuit. This was chosen over making a function especially for the HSDir circuit creation

so as to reduce code duplication and make error checking simpler, as well as achieving many of the

NFRs set out at the start of the project.

To retrieve the service descriptor from the HSDir, a stream was set up to one of the responsible

servers, which once again reused code from previous iterations. The only difference is the format of the

GET request that needed to be sent. This had to contain the IP address and the port number of the

responsible HSDir. To again achieve the project’s NFRs, the consensus was searched for these details,

as this was the only method that would be able to find the IP address and port for a node based on its

nickname.

Figure 20 - Code snippet showing how the service is downloaded and saved to a text file

The above code shows how a text file was created in preparation for the service descriptor. Using

previously implemented code to handle the stream data, code duplication was reduced; the

application writes each packet payload received to the text file.

It is important to make sure the file is open and able to be written to, as there was an issue during

development that prevented the document from being written to the file; this was because the file was

not opened with the correct permissions beforehand. A further challenge encountered during

development was that the document was not being fully received. This was due to the application only

expecting to receive a single packet of data, whereas the size of the document meant that multiple

packets were being received. The above code shows this implemented and how the application looks

for a RELAY_END packet to ensure all data has been received.
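
The two points described above, opening the file for writing and reading packets until a RELAY_END arrives, can be combined in a short sketch. It assumes the decrypted relay cells are available as an iterable of (relay command, data) pairs; that interface and the name `save_stream` are assumptions made for illustration and do not describe the project's code.

```python
RELAY_DATA = 2   # relay command carrying stream data
RELAY_END = 3    # relay command signalling the end of the stream


def save_stream(relay_cells, path):
    """Write every RELAY_DATA payload to `path` until RELAY_END is seen.

    Returns the number of bytes written. Raises ConnectionError if the
    cells run out before a RELAY_END is received.
    """
    written = 0
    # "wb" opens the file explicitly for writing (creating it if needed).
    with open(path, "wb") as out:
        for command, data in relay_cells:
            if command == RELAY_END:
                return written
            if command == RELAY_DATA:
                out.write(data)
                written += len(data)
    raise ConnectionError("stream closed before RELAY_END was received")
```

Raising an error when the stream stops without a RELAY_END makes an incomplete download visible to the caller, although it cannot detect corruption within the data that was received.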

Many methods were tried to ensure that the entire document was downloaded without corruption,

but unfortunately at present there are no features within the Tor protocol that allow this checking

to be carried out, so there is always a risk that the received data is corrupted.

4.3.4. Testing

Due to the high reuse of code in this iteration, which allows the NFRs of maintainability and usability to

be achieved, the testing below focuses on integration testing to ensure that the parameters passed

between modules are handled correctly.

As with the previous iteration, testing will be conducted using unittest and manually.

The testing strategy can be seen below; this shows the test that was conducted, how it was conducted

and whether the test was successful. As with the first iteration, in each round the tests were run three

times to ensure no anomalies were present in the results.

| Test No. | Test | Test method | Succeeded? | Comments |
|---|---|---|---|---|
| 50 | Able to calculate all the descriptor IDs for a specified hidden service | Unittest / Manually | | |
| 51 | Able to find the responsible HSDir servers for the calculated descriptor IDs | Unittest / Manually | | |
| 52 | Create a rendezvous cookie | Unittest / Manually | | |
| 53 | Calculate the IP address of the HSDir servers | Manual | Yes | |
| 54 | Calculate the DIR port of the HSDir servers | Manual | Yes | |
| 55 | Calculate the OR port of the HSDir servers | Manual | Yes | |
| 56 | Calculate the nickname of the HSDir servers | Manual | Yes | |
| 57 | Calculate the identity of the HSDir servers | Manual | Yes | |
| 58 | Create IP address with port for each responsible hidden service directory | Unittest / Manually | | |
| 59 | Create a stream to a responsible HSDir | Unittest / Manually | | |
| 60 | Send a GET request to request the service descriptor document | Unittest / Manually | First round: No; Second round: Yes | Failed the first round of testing due to an incorrectly formatted request; this was corrected and the second round was passed. |
| 61 | Receive the entire service descriptor from the HSDir | Unittest / Manually | | |
| 62 | Create a text file to store the service descriptor in | Unittest / Manually | | |
| 63 | Save the retrieved descriptor in a text file | Unittest / Manually | First round: No; Second round: Yes | Issues were flagged during the first round of testing: the file was created but nothing was being written to it because of a permissions issue. Once this was fixed, the test was passed. |
| 64 | Create a circuit to the rendezvous point | Unittest / Manually | | |
| 65 | Send the rendezvous point data to the introduction point | Unittest / Manually | | |

Figure 21 - Testing results iteration 3

As shown in the above table, the majority of the tests were passed first time. This was to be expected at this stage of the iterations, where the code is starting to rely heavily on previously implemented functionality. The first of the two failures, test 60, was not a major failure: the GET request was formatted incorrectly as the result of a simple typing error by the developer. Once the problem was identified, the issue was fixed and the test was passed.

Test 63, however, was a more serious issue. It was not immediately obvious why the data was not being saved to the document. Adding print statements to show the data received made it clear that the application was receiving data and that it had opened the text file. It was only by thoroughly examining the code that the error was found: the application had not been granted write permission for the text file. Once this was fixed, the tests were rerun and this time all were passed.

4.3.5. Moving forward from iteration 3

Overall this iteration has been a success, with all requirements outlined at the start of the iteration being implemented.

However, due to time constraints and the complexity of Tor, the decision was made to modify existing code in order to complete one requirement. This code was open source and so did not require the original developer’s permission in order to be used. Choosing to do this allowed development of the application to continue. The client will be told in the next round of discussions that the ambiguities between the Tor protocol documents and the lack of feedback from Tor are making the development considerably more difficult than previously expected.

Design decisions have been made in this iteration that will affect later iterations; these have been heavily evaluated and are judged to be the most appropriate for this use. However, they have been implemented in such a way that if the method for storing the descriptors is later considered inappropriate, it could be changed without too much redevelopment being required.


Testing proved to be a success; all tests were passed after just two rounds, although this stage of the iteration caused the developer some concern. It showed that a lack of attention to detail was causing simple bugs that should have been avoided, and these were having a detrimental impact on the application. The client will be informed of this and asked whether the timescale for the next iteration could allow for double checking, because in the rush to implement functionality, completeness is being sacrificed, which affects some NFRs.

4.4. Iteration 4

4.4.1. Requirements

During the meeting prior to commencing this iteration, the client announced this was to be the last

iteration required for the development of the Tor library. The requirements set out for this iteration

are shown below; these are the final requirements needed for this application to achieve the same

functionality as the official Tor client.

Requirement | Importance Level
Decrypt the service descriptor | High
Extract data from the decrypted document | High
Select an induction point and create a circuit to it | High
Create a stream to the hidden service | High

Figure 22 - Requirements for iteration 4

The reason why there are only four brief requirements in this iteration is because the client has

requested that this iteration focus on testing. However that does not necessarily mean that the

requirements set out are going to be easily achieved. All four requirements are dependent on one

another; for example if the first requirement – to decrypt the service descriptor - fails then none of

the other requirements can be implemented.

Decrypting the service descriptor does not simply mean passing the entire document through a base64 decoder. Only certain sections of the document are base64 encoded, and it is important to extract only the data required by the application, as it would be a waste of computational resources to extract and decode data that will never be used.

Once this requirement has been achieved data can then be extracted from the decrypted service

descriptor, although even here there will be data that needs to be base64 decoded. The method used

to extract the data from this document will likely be the same as used in the first document.

These requirements are critical to the development because the service descriptor contains all the information about the hidden service’s induction points (called introduction points in the Tor specification). These are nodes which communicate directly with the hidden service; without being able to connect to them, the application will not be able to talk to the hidden service or proceed to the next requirements.

Once these two requirements have been completed, the remaining two requirements should be easy

to implement as once again these will use previously implemented functionality.

4.4.2. Design

The previous iteration’s design features will be brought forward to this iteration. These include key aspects of the library structure, as well as the use of a text file to store the data from service descriptors.

It has been decided to decode the service descriptor and to save the decoded descriptor into a separate text file. This was considered in great detail; the alternatives included holding the data in a variable or dictionary, or overwriting the encoded service descriptor text file. A new text file for the decoded document was chosen because, as discussed in the previous iteration, storing this amount of data in variables would make it harder to access at a later date. In addition, as the service descriptor is valid for 24 hours, storing it in a text file enables the application to skip the retrieval of the service descriptor if the application has been shut down and restarted. This will reduce the time taken to connect to a hidden service, as well as the number of packets that need to be sent and received.
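
The caching idea described above can be illustrated with a short sketch. It is a minimal example rather than the library's own code: the function name `descriptor_is_fresh` and the use of the file's modification time are assumptions made for illustration, and the 24-hour lifetime is taken from the text.

```python
import os
import time

# Assumption: a stored descriptor is treated as valid for 24 hours, as stated in the text.
DESCRIPTOR_LIFETIME_SECONDS = 24 * 60 * 60


def descriptor_is_fresh(path, now=None, lifetime=DESCRIPTOR_LIFETIME_SECONDS):
    """Return True if a cached descriptor file exists and is younger than `lifetime`."""
    if not os.path.isfile(path):
        return False  # nothing cached, so the descriptor must be fetched
    if now is None:
        now = time.time()
    # Age is measured from the last time the file was written.
    age = now - os.path.getmtime(path)
    return age < lifetime
```

With a check of this kind, the application would only contact the HSDir servers when the function returns False.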

This method was chosen over overwriting the previous descriptor because, if something were to go wrong, such as the application not fully retrieving all the data before beginning to overwrite, the process would have to be started again, which would be a waste of resources. An advantage of overwriting would be that less space on the client’s hard drive would be used, but this will be compensated for by deleting both descriptors, if the client so wishes, when the application is closed.

The method by which the data is to be extracted from the descriptor was also considered in great detail. A dynamic method was planned, which would pick out and extract the data between headings. However, the Tor documentation states that the descriptor document should follow a set layout containing a set amount of data (3 induction points). As a result, the decision was made to use line numbers instead of dynamically retrieving data: this would be less complicated to implement and less prone to errors and, since the documentation stated that all descriptor documents should contain the same amount of information, there was no clear advantage to using a dynamic method over a static one.

4.4.3. Implementation

To decode the document, the text file containing the service descriptor first needed to be opened and the data extracted from it. It was decided to extract the data section by section and store it temporarily in an array list, which could then be formatted before being saved to a text file again. However, on inspecting the received service descriptor it became clear that it did not conform to the Tor documentation, as there were only two copies of each section rather than the expected three. This immediately became an issue for the developer, as the designed static method for retrieval was no longer appropriate; the dynamic method was required instead. This did not greatly impact the development of the application because both methods had been considered at length in the design section of this iteration and, at the time the issue was found, no significant code had been written.

Figure 23 - Code snippet showing how the service descriptor data is extracted

The code snippet above shows how the application first checks that a 404 error has not been received. This is a possibility if the HSDir servers do not have the service descriptor for a given hidden service. The user is informed of the error, which once again supports the NFR of usability. The code then dynamically finds the line range between each heading, which allows only the data contained below the headings to be extracted. However, this method is not perfect: each line contains three pieces of data, namely the line number, the data and the new line symbol. Once all the data has been extracted, only the required data must therefore be used. It is also at this point in the code that any base64 decoding of the data can be done. These methods are shown below:

Figure 24 - Code snippet showing base 64 decoding

The above method would also have been required for the static method. Overall, the change from static to dynamic retrieval of line numbers has allowed the application to handle non-standard descriptors. While it was more complicated and therefore took more time to implement, it is an improvement on the original design.
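
The dynamic retrieval and base64 decoding described above can be illustrated with a minimal sketch. It is not the code from Figures 23 and 24: the function names are hypothetical, and it assumes the HSDir response is held as text with the encoded section wrapped in `-----BEGIN MESSAGE-----` and `-----END MESSAGE-----` lines.

```python
import base64


def extract_between(lines, begin_marker, end_marker):
    """Return the lines strictly between the first begin/end marker pair."""
    stripped = [line.strip() for line in lines]  # drops the trailing new line symbol
    start = stripped.index(begin_marker)  # raises ValueError if the marker is missing
    end = stripped.index(end_marker, start + 1)
    return stripped[start + 1:end]


def decode_message(response_text):
    """Check for a 404 reply, then base64-decode the MESSAGE block of a descriptor."""
    lines = response_text.splitlines()
    status = lines[0].split() if lines else []
    if len(status) >= 2 and status[0].startswith("HTTP/") and status[1] == "404":
        raise LookupError("the HSDir does not hold a descriptor for this service")
    body = extract_between(lines, "-----BEGIN MESSAGE-----", "-----END MESSAGE-----")
    return base64.b64decode("".join(body))
```

Because the marker positions are found at run time, the same code works whether the descriptor describes two, three or more induction points.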

To save the retrieved data into a text document, a function was created which takes only the message produced by the service descriptor decoding function and writes it to a text file. This is a simple operation and discards other values, such as the signature, that are no longer required.

Some would argue that the decrypted document should contain all the data decoded from the service

descriptor document, however while this does make sense to do, it also uses up computational

resources for data that are no longer required.

Taking into account the non-standard service descriptors being received, the design of using static line

numbers to extract the data from the decoded service descriptor is also inappropriate. This is because

if a received service descriptor is of non-standard length, it is highly likely that the decoded descriptor

will also be of non-standard length. For these reasons, a function which retrieves the line numbers

dynamically was implemented. This is shown below:

Figure 25 - Code snippet showing dynamic methods to extract data from a service descriptor

This code shows how the lines of the document used for each variable depend on how many induction points are described within the document. This ensures all IP addresses are extracted correctly, as well as the other data contained within the document.
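
One way of extracting the fields without fixed line numbers is sketched below. This is an illustration only, not the code in Figure 25: the function name is hypothetical, and it assumes the decoded document uses the `introduction-point`, `ip-address`, `onion-port`, `onion-key` and `service-key` keywords of the version 2 rendezvous specification.

```python
def parse_introduction_points(decoded_text):
    """Split a decoded introduction point list into one dictionary per point."""
    points = []
    current = None
    key_field = None  # name of the key block currently being collected, if any
    for raw in decoded_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("introduction-point "):
            # A new entry starts here, however many entries came before it.
            current = {"identifier": line.split(" ", 1)[1]}
            points.append(current)
            key_field = None
        elif current is None:
            continue  # ignore anything before the first entry
        elif line.startswith("ip-address "):
            current["ip-address"] = line.split(" ", 1)[1]
        elif line.startswith("onion-port "):
            current["onion-port"] = int(line.split(" ", 1)[1])
        elif line in ("onion-key", "service-key"):
            key_field = line
            current[key_field] = ""
        elif line.startswith("-----"):
            if line.startswith("-----END"):
                key_field = None  # the key block is complete
        elif key_field is not None:
            current[key_field] += line  # still base64 encoded at this stage
    return points
```

The keys are returned in their base64 form, so any further decoding can be applied only to the values the application actually needs.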

Once all of the data has been extracted from the document, it is decoded in turn using base64. Not all of the data required decoding; only the induction point, onion key and service key did. The result is then passed to the main application to enable a circuit to be created to the induction points.

To find the induction point’s nickname using the IP address and ORPort provided in the decoded service descriptor document, a function created in a previous iteration was used, which searches the consensus for a nickname based on an IP address.
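
A lookup of this kind might look like the sketch below. The function name is hypothetical and the library's own function may differ; the sketch assumes the consensus is held as text and relies on the router status ("r") line layout from the Tor directory specification, in which the nickname is the second field and the IP address and ORPort are the seventh and eighth.

```python
def find_nickname(consensus_text, ip_address, or_port):
    """Return the nickname of the router with this address and ORPort, or None."""
    for line in consensus_text.splitlines():
        fields = line.split()
        # "r" lines: r nickname identity digest date time IP ORPort DirPort
        if len(fields) >= 9 and fields[0] == "r":
            if fields[6] == ip_address and fields[7] == str(or_port):
                return fields[1]
    return None  # no router in the consensus matched
```

Matching on the ORPort as well as the IP address avoids picking the wrong relay when several relays share one address.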

Creating a circuit to the induction points is simple once the nickname has been retrieved from the consensus: to achieve the NFRs of maintainability and usability, the existing circuit creation function was reused.

Before a stream can be opened to the hidden service, the application must first tell a selected induction point of the hidden service which rendezvous point will be used for communication between the hidden service and the client.

To do this a relay cell, as described in a previous iteration, needs to be created. This must contain the

rendezvous point IP address, ORPort, identity, rendezvous cookie and the decrypted service key that

was found in the service descriptor. This packet creation is shown below:

Figure 26 - Code snippet showing how to connect to an induction point

As with previous extended cells, the DH keys needed to be created, and the payload was built like that of a normal relay cell. One issue found during development concerned the public key for Bob. The original Tor documentation lists this as something completely different from the rendezvous documentation, which made it very difficult to find out exactly what the Tor protocol was expecting. This is another example of how the poor Tor documentation and protocol messages hindered the development. The question was asked on the official Tor development mailing list, but no one was quite sure of the answer, so 4 days were spent trying possible public keys for Bob by trial and error until the solution was found. The developer found that the public key for the hidden service was the service key contained within the decoded service descriptor; this needed to be base64 decoded again before being hashed with SHA1. This process delayed the development of the iteration, but the delay was purely down to not knowing what to expect.
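
The final derivation described above (base64 decode, then SHA1) can be sketched in a few lines. The function name is hypothetical, and the sketch assumes the service key has already been extracted as base64 text without its BEGIN and END lines.

```python
import base64
import hashlib


def service_key_digest(service_key_b64):
    """Base64-decode a service key and return its 20-byte SHA1 digest."""
    key_bytes = base64.b64decode(service_key_b64)
    return hashlib.sha1(key_bytes).digest()
```

The result is always 20 bytes long, whatever the length of the key.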

The decision was also made to create the v3 version of this packet. This supports several of the NFRs by making the application more future proof and, because the same inputs were required, the addition took only a few minutes to implement. The result is an application that can use both versions to inform an induction point of a rendezvous point.

A RELAY_COMMAND_RENDEZVOUS2 cell was expected back after the above packet had been sent. To ensure this was received, a while True loop was implemented and, using the previously implemented function to decrypt stream data, it was easy to watch for relay command 37 (RELAY_COMMAND_RENDEZVOUS2). This feature took only moments to implement, and reusing older functions meant that some of the time lost as a result of the above delay was regained.
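
The waiting logic can be sketched as follows. This is an illustration under stated assumptions: `receive_command` is a hypothetical callable standing in for the library's function that decrypts the next cell and returns its relay command number, and the sketch bounds the loop with `max_cells` rather than looping forever, which is a deliberate difference from the while True loop described above.

```python
RELAY_COMMAND_RENDEZVOUS2 = 37


def wait_for_relay_command(receive_command, wanted=RELAY_COMMAND_RENDEZVOUS2, max_cells=100):
    """Read cells until `wanted` arrives; give up after `max_cells` cells."""
    for _ in range(max_cells):
        command = receive_command()
        if command == wanted:
            return True
    return False  # the expected cell never arrived
```

Bounding the loop means a missing cell results in a reported failure rather than an application that hangs.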

The final requirement of this iteration was to create a stream to a hidden service. Originally, this was thought to be a simple case of reusing functions created earlier in the application to create streams and send data. However, the Tor documentation revealed that creating a stream to a hidden service requires a different payload to be sent in the packet, so a new function was created. The new function requires only the port of the hidden service as input; the payload is created and then used in a relay cell, which once again reuses previous functions.

The code to create a stream to the hidden service is below:

Figure 27 - Code snippet showing the packet creation of stream to a hidden service

The completion of this function allowed the final requirement of the iteration to be completed.
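
As an illustration of why only the port is needed, a possible payload builder is sketched below. It is not the code from Figure 27. The layout is an assumption based on the Tor rendezvous specification, in which the client opens the stream with an empty address and a chosen port, and on the Tor protocol specification, in which the address and port string is terminated by a NUL byte. The function name is hypothetical.

```python
def build_hidden_service_begin_payload(port):
    """Build the address and port field of a RELAY_BEGIN cell for a hidden service."""
    if not 0 < port < 65536:
        raise ValueError("port must be between 1 and 65535")
    # Empty address, then ":" and the port, terminated by a NUL byte.
    return (":%d" % port).encode("ascii") + b"\x00"
```

For port 80 this produces the four bytes `:80` followed by NUL.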

4.4.4. Testing

As the client stated this will be the last iteration in which the Tor library application will be developed,

the testing methods for this iteration will be slightly different.

The standard integration, functional and white box testing will be carried out as in previous iterations,

but as this is the end of development, the application as a whole needs to be thoroughly tested; while

the tests completed below will demonstrate the functionality of the application, to determine if the

NFRs have been achieved a new testing method will be used.

Black box testing will be conducted, which means potential end users of the application test the

application. This involves a user with no previous knowledge of the application testing its functions.

To ensure this was completed to a suitable standard a brief questionnaire was sent to 30 members of

staff currently employed by the University of Portsmouth as Service Delivery Advisors, as well as

members of Computing Masters Courses at the University. The aim was to determine if any

respondents would be considered representative of the application’s target audience of experienced

application developers. The questionnaire can be found in Appendix 6.

From the results of the questionnaire, 5 members of staff were found to be appropriate. They possessed no knowledge of any of the implementations, and were given instructions and asked to briefly summarize their thoughts and experiences of the application; these are shown in Appendix 7.

Overall the results were positive, with feedback being received that the comments were appropriate

and suitably detailed so that the user felt confident in knowing what each function does. The function

names were also praised for being descriptive, as this made the application feel intuitive for the user.

It was also mentioned that overall the application looked well thought-out with good use of a directory

structure when required. The feedback received from this testing method showed that this

application, in the eyes of the end users, achieved many of the NFRs as set out in iteration 1. This is a

great achievement and will go a long way to ensuring the success of the application.

The table below shows the standard testing methods undertaken for this section:

Test No. | Test | Test method | Succeeded? | Comments
67 | Create a text file to store the decrypted service descriptor in | Unittest | Yes |
68 | Save the decrypted descriptor in the text file | Unittest | Yes |
69 | Extract the data contained within the decrypted service descriptor | Manual | First test: Partial. Second test: Yes | As with the previous service descriptor, this was found to vary in length and in the amount of data contained. The original method of extracting data between specified lines therefore no longer worked, and a dynamic method was created to ensure this would not be a problem
70 | Create a circuit to an induction point | Unittest | Yes |
71 | Ensure the RELAY_COMMAND_RENDEZVOUS2 cell is received and handled correctly | Unittest | Yes |
72 | Create a stream to the hidden service | Unittest | Yes |
73 | Ensure a RELAY_CONNECTED cell is received | Unittest | Yes |
74 | Handle any error messages received during the creation of a stream | Unittest | Yes |

Figure 28 - Testing results iteration 4

As shown in the above table, all bar one of the tests were passed first time. This may be accounted for by the fact

that the developer learned from previous iterations that trying to rush the implementation resulted

in small errors such as spelling mistakes that could have easily been avoided. During this iteration

more time was spent ensuring this would not happen again.

The test that failed was test 69. This was picked up during the implementation, where testing was carried out as development occurred, and it was quickly fixed. This test should not have failed, as the Tor documentation states that the service descriptor is of a set size. However, the failure did allow an improvement to be made to the implementation, and as a result the application was made more robust when receiving non-conventional service descriptors.

4.5. Iteration 5

4.5.1. Requirements

The Tor library was now complete and had been shown to the client, who was very impressed with the

overall functionality of the application, which meets all of the functional and non-functional

requirements. The client stated that this is the end of developing the application as a library.

Throughout the course of the project, during meetings with the client, he implicitly mentioned a

possible Denial of Service (DoS) attack. With the development of the project completed, and with a

small amount of time left, the decision was made by the developer to try and implement such an

attack. This is not required for the application development but would expand on its features, and

demonstrate how capable the application is.

The requirement for this iteration is just a single functional requirement - to conduct a DoS; this is a

low priority requirement, as it is unlikely to be completed, but will provide a good foundation for a

future attack and further development of the application.

4.5.2. Design

A denial of service attack is an attack that aims to make a service, machine or resource unavailable

temporarily or, in some cases, indefinitely. Although many different types and methods of DoS attacks

have currently been implemented, none have been documented as being against a Tor node.

Denial-of-service attacks are considered to be violations of the Internet Architecture Board's Internet proper use policy (Symantec, 2014). Consequently, all the proposed attacks will be conducted either on a private network, where the nodes are controlled by the developer and are not used by anyone else, or on a simulator. This ensures that no user is ever inconvenienced during the attack, thus fulfilling the ethical requirements of the project, as there is no possibility that anyone will be negatively affected.

The design considered many methods for conducting a DoS attack, but there were two key candidates: a bandwidth saturation attack and a memory drain attack. The bandwidth saturation attack is thought to be the easier of the two to implement, as it simply requires the application to flood a large amount of data through a Tor circuit containing the target node(s) until they cannot serve any other users. This prevents access to a node but, unlike a memory drain attack, where the aim is to use up all the RAM until the Tor process is killed by the operating system, the node remains online and, should the attack stop, communication with the node will continue as normal. Both attacks are capable of achieving the goal of disrupting the availability of the node. However, the bandwidth saturation attack is capable of attacking more than a single node at once, whereas a memory drain attack targets a specified node. This, coupled with the ease with which a bandwidth saturation attack is expected to be implemented, is why this form of attack was chosen over the memory drain attack.

The bandwidth saturation attack will be conducted by following the below method:

1. The attacker creates a circuit containing the target node(s). For most effectiveness and to increase the likelihood of success, the client would create multiple circuits through the network, not necessarily in the same circuit order.

2. The attacker would then create a stream through each circuit to a web server containing a large file.

3. The client would send a GET request through each circuit to the web server hosting the file.

4. This causes the file to be downloaded through the Tor network using the previously created circuits.

5. The amount of data processed through the network would overload the nodes, making them unusable for other clients.

This is shown in the diagram below:

Figure 29 - Diagram showing the bandwidth saturation attack

The large file was created using the Windows command, which created a 5GB DAT file filled with

random data. The contents and the format of the file were irrelevant, but the size of the file would

impact the effectiveness of the attack. For example a 5GB file would need to be sent in more packets

than a small 200MB file.
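
The effect of file size on the number of cells can be estimated with a short calculation. This is a rough sketch under stated assumptions: it uses decimal units (1GB = 10^9 bytes), assumes every relay cell carries the maximum of 498 bytes of stream data (the 509-byte cell payload less the 11-byte relay header given in the Tor protocol specification), and ignores all other protocol overhead. The function name is hypothetical.

```python
# Assumption: maximum stream data carried by one relay data cell.
RELAY_DATA_BYTES_PER_CELL = 498


def relay_cells_needed(file_size_bytes):
    """Smallest number of full-size relay data cells that can carry the file."""
    # Integer ceiling division avoids floating point rounding on large sizes.
    return -(-file_size_bytes // RELAY_DATA_BYTES_PER_CELL)


cells_5gb = relay_cells_needed(5 * 10**9)
cells_200mb = relay_cells_needed(200 * 10**6)
```

Under these assumptions the 5GB file needs about 10 million cells and the 200MB file about 400,000, a factor of roughly 25.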

To ensure no users were affected during the attack, a single Tor node was configured; this was nicknamed “Goblin500”. The node was configured using the settings recommended by the Tor Project, rather than deliberately reducing its capability to make the attack simpler to achieve. Most relay nodes will use the recommended settings, so configuring the node to be representative of real-world settings makes the attack more relevant to the real world as well. It would be simple to DoS a node with a bandwidth limit of 5Kbs, but such a low limit would not be expected in the real world, so to do so would give the attack a false success rate.

[Figure 29 labels: the client sends a GET request through Target Nodes A, B and C, and the web server responds with the target file. Nodes will be in different orders for each circuit. Multiple streams will also be created through each one.]

4.5.3. Implementation

During the configuration of the target node, a setting recommended by the Tor Project caught the

developer’s eye. Shown below is the setting in question:

Figure 30 - Configuration screenshot showing the Max bandwidth allowed per day on a Tor node

The default maximum amount of data to be received and sent is 4GB. This means that a single stream through the node requesting a 5GB file would, if the download proceeded, cause the node to stop receiving traffic once the limit was reached. If the setting to reset this count was also in use, an attack could be scheduled for every day, ensuring that the allocated bandwidth allowance was consistently used up.
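
To put the 4GB allowance in context, the time taken to reach it at a steady rate can be estimated. This is a simple worked calculation, not a measurement: the function name is hypothetical, decimal units are assumed, and the example rate of 100 kilobytes per second is an assumption about how the node's configured bandwidth should be read.

```python
def hours_to_reach_limit(limit_bytes, rate_bytes_per_second):
    """Hours of continuous transfer at a fixed rate before `limit_bytes` is reached."""
    if rate_bytes_per_second <= 0:
        raise ValueError("rate must be positive")
    return limit_bytes / rate_bytes_per_second / 3600


# Assumed figures: a 4GB allowance and a sustained 100 kilobytes per second.
hours = hours_to_reach_limit(4 * 10**9, 100 * 10**3)
```

Under these assumptions the allowance would be reached in roughly 11 hours of sustained transfer, which is well within a daily accounting period.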

To achieve this, a function was created that would call the DoS attack every 24 hours, allowing the

target to be attacked daily. This function is shown below:

Figure 31 - Code snippet showing a function to allow code to run every 24 hours
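
Running a task at a fixed interval can be sketched generically as below. This is not the function shown in Figure 31: the name `run_every`, the `repeats` limit and the injectable `sleep` parameter are assumptions made so that the example is self-contained and can be exercised without waiting.

```python
import time


def run_every(interval_seconds, task, repeats, sleep=time.sleep):
    """Call `task` `repeats` times, pausing `interval_seconds` between calls."""
    results = []
    for i in range(repeats):
        results.append(task())
        if i < repeats - 1:
            sleep(interval_seconds)  # no pause is needed after the final call
    return results
```

Passing 24 * 60 * 60 as the interval gives a daily schedule, and a stand-in for `sleep` allows the logic to be unit tested.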

To once again promote good code reuse, and to prevent code duplication, the code for the DoS attack

was contained within a function which used other functions that had previously been implemented in

the application.

The code for the DoS attack is shown below:

Figure 32 - Code snippet showing the DoS attack

Here it is clear that multiple circuits have been created. Ten circuits were created rather than a single circuit because it was predicted that requesting the file through 10 different circuits, each containing a stream to the webserver, would make the Tor relay process 50GB of data, which is more than enough to saturate the 100kb/s bandwidth it is configured for. A challenge of creating 10 circuits is keeping track of them all, as each circuit is an instance; it was important not only to create a circuit but also to remember the instance in order to use the circuit at a later date. To do this, an array called ‘storage’ was created to hold the circuit instances until they were required.

The streams could have been created in the same function as the circuits. This would have reduced the amount of code needed to run the application and avoided the need for another in range loop, but this approach was not chosen. Instead, it was decided to create the circuits first and the streams second. This made it easier to understand what was happening at what time and also reduced errors, as it had been found that streams were not being created fully before the application began creating another stream, causing errors in the program. The extra loop also enables multiple streams to be created through a single circuit, further maximizing the effectiveness of the attack.

The same reasoning was applied to the sending of a GET request to start downloading the file. This too could have been implemented within the loop creating the streams, but keeping it separate makes the code easier to read and modify without affecting other sections.

It can also be argued that splitting these processes into three loops rather than a single loop makes the attack more efficient. In a single-loop function, the client would have to set up the circuit, then the stream, then request data, before repeating x times. In the implemented code above, the circuits and streams are already established, so the Tor node is hit with the GET requests from all these streams much faster than in a single-loop version, which arguably makes the attack more effective.

4.5.4. Testing

The testing for this iteration required a different method to be devised, as a unittest would not be able to establish whether the attack was successful.

Instead, the target node was configured to log all bandwidth use, allowing it to be seen whether the bandwidth could be forced above the 200mb/s burst rate the node was configured for. In addition, during the attack another computer running the Tor library would try to connect to the Tor node, simply attempting to create a normal circuit to see whether the node could still be connected to.


Unfortunately, the results from testing were inconclusive. Although the network analysis tools on the Raspberry Pi showed spikes of up to 450 KB/s, these spikes were inconsistent, as the bandwidth would often rise and fall dramatically. However, when the other machine tried to connect to the node, it was unsuccessful. Circuits were created to other Tor nodes, but when the application tried to connect to Goblin500, after a long period of time a destroy cell was received containing the error code 0x08, which translates to “OR_CONN_CLOSED”. Once the attack was stopped, the application again tried to connect to Goblin500 using the same circuit, and this time the connection proceeded as normal. This would indicate that the attack was successful; however, the network analysis does not definitively support this.
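
Translating the reason byte of a destroy cell into a name can be done with a simple lookup, sketched below. The names follow the list of destroy reasons in the Tor protocol specification as understood at the time of writing (newer versions of the specification rename reason 8), and the function name is hypothetical.

```python
# Destroy cell reason codes, keyed by the single reason byte in the cell.
DESTROY_REASONS = {
    0: "NONE",
    1: "PROTOCOL",
    2: "INTERNAL",
    3: "REQUESTED",
    4: "HIBERNATING",
    5: "RESOURCELIMIT",
    6: "CONNECTFAILED",
    7: "OR_IDENTITY",
    8: "OR_CONN_CLOSED",
    9: "FINISHED",
    10: "TIMEOUT",
    11: "DESTROYED",
    12: "NOSUCHSERVICE",
}


def describe_destroy_reason(code):
    """Return the name for a destroy reason code, or a marker for unknown codes."""
    return DESTROY_REASONS.get(code, "UNKNOWN (%d)" % code)
```

Logging the name rather than the raw byte makes results such as the one above easier to interpret during testing.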

5. Evaluation

Appendix 8 lists the functional requirements of the project along with an analysis of whether they were achieved and, where they were not, a brief explanation of why. The success of the requirements was evaluated based on testing results and client feedback collected during the frequent interviews that were held.

Almost every functional requirement for the Tor library has been achieved. Only one requirement was unsuccessful: to create a DoS (Denial of Service) attack. This, however, does not impact negatively on the application because the client only expected and requested a fully functioning Tor library, and this has been achieved. Creating a DoS attack was not a core requirement; it was merely a desirable feature of the project. Although a DoS attack was developed, the results used to assess its success are currently inconclusive. This requirement was added because of excess time, and was an ambitious requirement to implement at a late stage in the project. Even so, the code produced provides a good base for future development, which ties in to one of the key objectives of the project: for the application to be used for future development.

However, improvements could still be made to the application and requirements could be expanded upon. An example of this is the requirement to retrieve the responsible hidden service directory nodes for a selected hidden service. This was achieved after modifying code previously written by Donncha O’Cearbhaill (O’Cearbhaill, 2013). While this meant that the requirement was met, it might be beneficial to the application if the developer were to develop his own code for this function at a later stage. Improvements to the current solution can already be seen. For example, when a hidden service falls at the end of the current array, the code should loop round to the start of the array to create a circle effect. Despite fulfilling the project requirements, the current code does not do this.
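
The circle effect described above could be achieved along the lines of the sketch below. It is an illustration of the suggested improvement, not existing project code: the function name is hypothetical, and it assumes the HSDir fingerprints are held in a sorted list of strings that compare in the same order as the descriptor ID (for example, hexadecimal strings of equal length and case).

```python
import bisect


def responsible_hsdirs(sorted_fingerprints, descriptor_id, count=3):
    """Pick the `count` fingerprints that follow `descriptor_id`, wrapping round the list."""
    if not sorted_fingerprints:
        return []
    # Index of the first fingerprint that sorts after the descriptor ID.
    start = bisect.bisect_right(sorted_fingerprints, descriptor_id)
    total = len(sorted_fingerprints)
    count = min(count, total)
    # The modulo makes the selection loop round to the start of the list.
    return [sorted_fingerprints[(start + i) % total] for i in range(count)]
```

With fingerprints "10", "30", "50" and "70" and a descriptor ID of "60", the selection is "70", "10" and "30", wrapping past the end of the list.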

It should also be noted that, although all of the requirements were met, they were not completed

within the original timescale. This was due to the complexity of the Tor protocol hindering the

developer’s progress. Despite this, however, by putting more resources into the development all of

the functional requirements were achieved by the project deadline.

The achievements of non-functional requirements are shown in Appendix 9. The success rate of these

was less than that of the functional requirements. This is because the non-functional requirements

are harder to assess due to their frequently subjective nature. That being said, none of the NFRs failed

testing, and out of 18, only 2 came back as partially completed. This meant that although the

requirement did not fail, it had not been fully implemented.

The NFRs that are considered to be only partially completed are:

Able to run on multiple platforms – Although the application is able to run on all major

operating systems, it is only able to run without installing additional libraries on Ubuntu Linux;

in order for it to run on Windows and Mac, additional libraries need to be installed

beforehand. Because the application depends on these libraries, this cannot be

avoided unless the functionality were developed within the application itself, which would increase

the development complexity and could result in additional bugs being introduced into the application.


No more than 10 bugs on delivery – Currently there are no known bugs in the application, as

extensive testing has been conducted using different methods. However, this does not mean

that the application is free from bugs; there could be bugs within the application that have not

yet been discovered. If bugs were to be discovered, these could easily be fixed. To

the best of the developer’s knowledge, at the current time the application is free from bugs.

Following testing of the application by a group of potential end-users, all of the NFRs can be considered

to have been successfully passed.

Overall, throughout this project extensive development has been carried out to ensure that all

requirements are met. Examples of this can be seen in each iteration, where requirements have been

prioritised and implemented in that order. Where requirements could not be carried out in the

intended iteration, they were carried forwards to a later iteration after confirming this was acceptable

to the client, as was the case with handling data from a destroy cell, which was achieved one iteration

later than had previously been scheduled.

6. Summary, conclusion and recommendations

Overall, this application can be considered a success as the objectives that were defined at the start

of this project have been met. The vast majority of the project requirements were achieved first time,

and thorough testing allowed bugs to be identified and removed from the software.

However, this project has not been without major challenges.

A key issue was the project running behind schedule as a result of the complexity of the Tor protocol

documentation and the lack of error messages provided by the Tor network. This was at times

frustrating and can be said to have hindered development. As an example, Bob’s hidden service public

key was required for one packet. However, the documentation did not state what this public key

was, and even developers on the Tor mailing list were unsure as several different answers were

received, all of which turned out to be incorrect (Dennis, 2014). This made the project more

complicated than had previously been anticipated.

By comparing the original project development plan in Appendix 3 to the actual project development

plan in Appendix 10, the delays experienced by the developer are evident. However, careful project

management and an increase in the amount of time dedicated to the project ensured that the developer

was able to complete the application on time and fulfil the client's requirements.

Another unforeseen issue was the amount of time required for the developer to learn Python to a

suitable level. Originally only a week had been allowed for this, but more time was required, which

again impacted on the intended time schedule.

The Tor library created as part of this project provides a strong base for future development, thus

achieving one of the key objectives outlined at the start of this project. For example, it could provide

a foundation for attacks against Tor to be created as, unlike other Tor libraries, the application

connects directly to the Tor network using the Tor protocol and does not simply manipulate the Tor

client. Further study of the DOS attack is recommended, as well as using this application to

implement many of the attacks described in the literature that have not yet been proven, such as the

Sniper attack (Jansen, 2014).

It is hoped that further developing this application could have far reaching consequences: it could be

used by law enforcement to de-anonymise users of hidden services, such as paedophiles, drug dealers

and those involved in the illegal arms trade.

The developer’s personal opinion of the project is that this has been an extremely challenging project,

which turned out to be much more complex than had been anticipated. Directly communicating with

the Tor network presented challenges, and the Tor protocol documentation was found to be, at times,

more of a hindrance than a help. In addition, there is not currently a very strong community of Tor

developers, so when trying to clarify the Tor documentation it was not always possible to ask for help

from fellow developers.

8. Appendixes

Appendix 1 – Project Initialization document

School of Computing

Postgraduate Programme

MSc in Computer and Information

Security Project Specification

Richard Dennis

Project Specification Richard Dennis

Project Specification

Basic details

Student name: Richard Dennis

Draft project title: Development of a functional Tor library

Course and year: Computer and Information Security 2014

Client organisation: University of Portsmouth

Client contact name: Dr. Gareth Owen

Project supervisor: Dr. Gareth Owen

Outline of the project environment

The client of this project is Dr Gareth Owen of the University of Portsmouth. They are course leader of the

undergraduate Digital Forensics course and teach digital forensics, cryptography and other units. They

require this project to be undertaken as there is a clear lack of Tor libraries that can be used for development.

The client wishes for this project to address this gap, and would like to use the application developed from

this project as a foundation for future developments to conduct attacks such as de-anonymization attacks on

hidden services and clients. It therefore needs to offer the same level of functionality as the Tor client.

The problem to be solved

This project aims to create an application that can be used as a library to enable future development on the

Tor protocol. There are currently libraries available that allow development with Tor; however, these require

Tor to be installed first and only control the Tor client. There is currently no application that can

communicate directly with the Tor network to send and receive specific packets.

The aims of this project are:

To create a Tor library

Create a circuit through Tor

Connect to a hidden service

Perform an attack on the Tor network

There are, however, many constraints on a project such as this. These include:

Hard deadline – 12th September 2014. On this day all project deliverables must be handed in to the client.

Zero budget

Availability of software – Although the software used will be open source due to the budget constraint, support or access to the software may be limited.

Knowledge of the problem is limited; gaining the necessary understanding of Tor, as well as learning the new skills required for the project, will require a large amount of time.

Other commitments will interfere with the project; good project and time management are required.


Breakdown of tasks

This project will rely heavily on research: for the Tor library to be completed, research into the Tor protocol

will need to be carried out. This will be done by reading through the Tor specification and documentation. It

will also mean looking at existing Tor libraries, to see how they work and how they can be improved

upon. Research into the literature will also need to

be conducted in order to get a better understanding of how the attacks work and how they could be implemented.

Many skills will be required to complete this project; the main one will be a thorough understanding of the

selected programming language. If the skills in this area are not at a suitable level, it will be necessary to

quickly learn the language to an acceptable level. This will be done through online tutorials, reading

the programming language documentation and, finally, programming small tasks of increasing size and difficulty.

The project is relatively light in the design and build sections. No additional software needs

to be purchased and, since this project has a budget of zero, all required libraries or additional software will

need to be open source.

The design section will need to focus on the layout of the library, focusing on usability, as there will be no

requirement to implement a GUI. Usability will be a key design decision and crucial to the success of the

application, as its intended end use is for other developers to use and improve upon.

The implementation of this application will only require a computer with an internet connection, as well as

the appropriate programming language software such as an IDE.

Project deliverables

A fully functioning and tested Tor library, along with a detailed README file, will be produced.

Accompanying the artefact will be the project report, which will comprise the following sections:


Literature review

Requirements of software

Design of the artefact

Implementation of the artefact

Testing & Evaluation

Future recommendations




There will be a comparison between all appropriate requirements elicitation methods currently available, and the

most suitable for this project will then be applied. The requirements will not come only from the client: to ensure

this application also meets the needs of the end user, end users will also be a major part of the elicitation of the

requirements.



Legal, ethical, professional, social issues

There should be no legal, ethical, professional or social issues with this project. All software used will be open

source and should any code be required to be used from other projects this will only happen once the written

permission from the owners of the code is granted.

Any attacks on the Tor network will be conducted either on a simulator or on a closed Tor network where the

developer controls all the nodes. This will ensure that no users are affected during the attacks, and that none of

their personal data and details are ever exposed to the attacks.

There will be no need to gain research ethics approval. However, the supervisor of this project will be kept

up to speed about the project development, ensuring that if any approvals were required, they would be simple

to obtain.

Facilities and resources

Only a computer with internet access and the programming language development environment installed on it

will be required. There will be no constraints on their availability and, as the budget for this project is zero, no

additional software or hardware can be purchased. The library facilities will be required when conducting

background research and research into the literature review, although this can be done online, and the library

is open 7 days a week. The Tor foundation also provides a good facility listing all relevant literature on Tor, and

access to this will be required.

Project plan

A Gantt chart showing the estimated schedule, including the development of the

application as well as the report

[Gantt chart. Weekly columns from 7 May to 12 September. Task rows: Project Initialization Document; Ethical document; Research literature; Learn programming language; Iteration 1; Iteration 2; Iteration 3; Iteration 4; Iteration 5; Development overrun; Check formatting and proof reading.]


As with any project, there is risk involved; the table below shows the potential risks and the strategies in place to

reduce them.

Risk | Impact level | Reduction strategy
Tor network unavailable – Internet access or the Tor network | High | Use a Tor simulator such as the Tor Path Simulator (TorPS).
Poor productivity – Developer’s motivation inhibits the project’s | High | Set 20 hours a week minimum for the project, more when needed. Setting small milestones will increase motivation and productivity. Regular meetings with the client will ensure the milestones are met.
Technical risk – Project is too complex to implement | High | Regular meetings with the client ensuring they are kept up to date with the development, and adjust the requirements to allow for a work around if possible.
Programmatic risk – Customer changes their mind about wanting the project developed | High | Find another client or adapt the project to cater for the client's change of heart.
Inherent schedule flaws – due to the uniqueness of the project, it is difficult to estimate and schedule. | Medium | Better to overestimate than underestimate timescales; use the Agile methodology to renegotiate the schedule with client.
Requirements Inflation - more features that were not identified at the beginning of the project emerge that threaten estimates and | Medium | Keep in constant contact with the client with regular meetings etc., only accept more features if timescale allows.
Specification Breakdown – Only during the development does a conflicting requirement become | Medium | Contact the client, work out a solution that would have the lowest impact.
Insufficient resources – Unable to develop the project due to not having access to a required resource. | | See if the resource is really required, look for ways to reduce resource use previously in the project, and try to gain the required resource.
Incorrect budget estimation – Overall cost of the project starts to increase and spiral | | There is a budget of zero for this project, to maintain this open source software and libraries will be used.

Registration mode Full Time

Project mode Full Time

Planned submission deadline September 2014

Signatures Signature: Date:



Project supervisor

Appendix 2 – Ethical checklist

PJE40 and PJS40

Ethical Examination

Undergraduate Final Year Projects

School of Computing

Faculty of Technology

This document describes the issues that you need to consider before you start your investigations.

This is particularly important where your work may involve other people (human subjects) for the

collection of information as part of your project work.

The examination takes the form of a checklist of 12 questions. Each question has some guidance


Consider each question in turn and check the box for Yes or No.

You are then asked to write a short entry explaining the reason for your reply.

For example:

6. Are you in a position of authority or influence over any of the human subjects in your study?


Although all the human subjects will be staff members at the

University of Portsmouth, none of them are in my own department or

area and none are subordinate to me in a management structure.

Therefore I can see no way that I could have undue influence over

them. They could take part completely voluntarily.

Yes No

If a grey box is ticked then your project ideas need to be looked at more closely, and you MUST discuss

this matter with your project Supervisor.

The final sections deal with Information Sheet(s) and Informed Consent, and you must attach any

extra documents concerning these (where relevant) to this Ethical Examination at time of the

submission of your Initial Report.

Ethics Information: 12-point Checklist

1. Will the human subjects be exposed to any risks greater than those encountered in their normal lifestyle?

For example: could the study induce psychological stress or anxiety; is more

than mild discomfort or pain likely to result from the study; will the study

involve prolonged or repetitive activities?

Investigators have a responsibility to protect human subjects from physical

and mental harm during the investigation. The risk of harm must be deemed

to be no greater than in their normal lifestyles.

Comments: No harmful activities

Yes No

2. Will the human subjects be exposed to any non-standard hardware or non-validated instruments?

Human subjects should not be exposed to any risks associated with the use of

non-standard equipment: anything other than pen-and-paper, or typical

interactions with desktop, laptop PC’s, tablet PC’s, PDA’s or mobile phones are

considered non-standard (for example, using a VR room) nor should they be

subjected to non-validated instruments e.g. unscrutinised questionnaires.

Comments: Only a computer with internet access will be used

Yes No

3. Will the human subjects voluntarily give consent?

If the results of an evaluation (for example) are likely to be used beyond the

term of the project (for example, software is to be deployed or data is to be

published), then signed consent is necessary. A separate consent form should

be signed by each human subject. Return of a consent email can constitute

written consent if this has been made clear to the human subject.

Otherwise verbal consent is sufficient and should be explicitly requested in the

introductory script (Information Sheet).

Comments: Consent requested verbally if needed

Yes No

4. Will any financial, or other, inducements (other than reasonable expenses and compensation for time) be offered to human subjects?

The payment of human subjects must not be used to coerce them against their

better judgement, or to induce them to risk harm beyond that which they risk

without payment in their normal lifestyle.

Comments: Any participants in the project will be asked to volunteer, they will

not be paid using any method.

Yes No

5. Does the study involve human subjects who are unable to give informed consent (for example: children under 18, people with learning disabilities, unconscious patients).

Parental consent is required for human subjects under the age of 18. Additional

consent is required for human subjects with impairments, and people assessed to

be lacking in mental capacity. If consent is gained from a person other than the

human subject themselves e.g. a parent, then written consent must be obtained.

Comments: No subjects unable to give informed consent will be asked; this project is

not likely to require any such subject.

Yes No

6. Are you in a position of authority or influence over any of your human subjects?

A person in a position of authority or influence over any human subject must

not be allowed to pressurize them to take part in, or remain in, any study.

Comments: No subjects over whom I have authority or influence will be asked

Yes No




7. Are the human subjects being provided with sufficient details of the study at an appropriate level of understanding?

All human subjects should be able to understand the information provided in

any documentation and/or verbal information they receive about the

experiment or study. They have the right to withdraw at any time during the

investigation, and they must be able to contact the investigator after the

investigation. They should be given the details of both student and supervisor

as part of the debriefing. This information should be in the introductory script

(Information Sheet).

Comments: Any participants given information about study

Yes No

8. After the study, will human subjects be provided with feedback about their involvement and be able to ask any questions they may have about this involvement?

If the human subjects request further information, the investigator must provide

the human subjects with sufficient details to enable them to understand the nature

of the investigation and their part in it.

Comments: Feedback form and time for any questions afterwards

Yes No

9. Will the human subjects be informed of the true aims and objectives of the study?

Withholding information or misleading human subjects is unacceptable if human

subjects are likely to object or show unease when debriefed. It must be clear to

human subjects if information is being withheld in order to elicit a true response.

This should precede any analysis of the data.

Comments: All information about study to be presented before

Yes No

10. Will the data collected from the human subjects be made available to others (where appropriate and only in relation to this research study), and be stored, in an anonymous form?

All human subject data (hard-copy and soft-copy) should both be stored securely

and, if appropriate made available, in an anonymous form. Making human

subject data available to a third party may be relevant where a student is taking

part in a wider research project, e.g. for a member of the University staff, in which

case anonymity of human subject data must be preserved.

Comments: No subject names recorded

Yes No

11. Will the study involve NHS patients, staff, or premises?

If yes, then an application must be made to the appropriate external NHS Local

Research Ethics Committee (LREC). For projects other than postgraduate

research studies, the length of time for gaining external approval may not fit

into a project timescale.

Comments: No NHS items needed

Yes No

12. Will the study involve the investigator and/or any human subject, in activities that could be considered contentious, morally unacceptable, or illegal?

If yes, then further approval must be sought. For example: a project involving the

study of pornography on the web will fall into this category. It is possible that the

project may not be allowed to proceed.

Comments: No morally unacceptable or illegal activities involved

Yes No

Please attach the following:

Any Information Sheet(s) or introductory script(s) that the investigator has created for the benefit of the human subjects in the study. (See for examples of Information Sheets that set out details of a research study for human subjects).

Any documentation that the investigator has created to gather informed consent from the human subjects. This may be an Informed Consent Form, or a form of wording used to get verbal consent. (See for an example of an Informed Consent Form for research study with human subjects).


By signing this form, I AGREE to abide by the decisions made in the above points.

If at any time during my project, my answers would change from a white box to a grey box,

then I MUST seek re-approval for my project. I understand that if I do not do so, then it is

possible that I may FAIL the project component of my course.

Student name: …….………………………………… Jupiter number: ………………

Student signature: ………………………………….. Date ……………………………


Supervisor signature: ………………………………. Date ……………………………..

Appendix 3 – Original planned project schedule

[Gantt chart. Weekly columns from 7 May to 12 September. Task rows: Project Initialization Document; Ethical document; Research literature; Learn programming language; Iteration 1; Iteration 2; Iteration 3; Iteration 4; Iteration 5; Development overrun; Check formatting and proof reading dissertation.]

Appendix 4 – Project schedule updated due to iteration 1 overrunning

[Gantt chart. Weekly columns from 7 May to 12 September. Task rows: Project Initialization Document; Ethical document; Research literature; Learn programming language; Iteration 1; Iteration 2; Iteration 3; Iteration 4; Iteration 5; Development overrun; Check formatting and proof reading dissertation.]

Appendix 5 – Project schedule updated due to iteration 2 overrunning

[Gantt chart. Weekly columns from 7 May to 12 September. Task rows: Project Initialization Document; Ethical document; Research literature; Learn programming language; Iteration 1; Iteration 2; Iteration 3; Iteration 4; Iteration 5; Development overrun; Check formatting and proof reading dissertation.]

Appendix 6 – Suitability analysis questionnaire

Appendix 7 – Black box testing questionnaire

Were you able to run the application?

Do you have any pre-existing knowledge of the implementation of the library you have been asked to


Were all required libraries pre-installed?

If 'No' to above, what libraries were required to be installed, and were they able to be installed?

What environment will you be testing the application in? (OS, etc.)

What are your first impressions of the application?

How was the directory structure of the application?

Were the comments on each function adequate to understand what was expected of the function?

Were the names of the functions easy to understand and relatable to the functions they described?

Was the output displayed to the user suitable for this type of application?

Did you understand what the application was doing?

Did you face any error messages?

If you faced error messages, were they appropriate and did they provide help to finding the cause of

the error?

Would you feel confident in using a function, supplying it with the correct information?

Would you use the library to develop your application if you were in need of its features?

Would you develop this application to provide greater functionality, if you needed to do so?

Overall how would you rate the actual functionality of the application, compared to the supposed


Finally, would you use this application again?

Appendix 8 – Evaluating functional requirements

Requirement | Achieved? | Comments
Connect to the Tor network | Yes | Achieved without requiring the Tor client to be installed
Send and receive a version cell | Yes |
Decode NetInfo cell to extract data from it | |
Handle errors from destroy cells | Yes | This was implemented at a later iteration than first planned
Create a circuit through the Tor Network | |
Create a circuit of any length | Yes |
Create a circuit using specified nodes | Yes |
Create a stream through Tor to a web server | |
Create multiple streams through Tor to a web server | |
Able to retrieve webpages from an internet web server through Tor | |
Retrieve the three responsible HSDirs nodes for a specified hidden service | Yes | Used a modified function from Donncha O'Cearbhaill to enable this to be completed
Retrieve the service descriptor for the selected hidden service | |
Connect to a selected rendezvous point | |
Create a circuit to a HSDir server responsible for the selected hidden service | |
Download and save the service descriptor | |
Decrypt the service descriptor | Yes |
Extract data from the decrypted document | |
Select an introduction point and create a circuit to it | |
Create a stream to the hidden service | |
Create a DOS (Denial of service) attack to incapacitate a chosen node. | Partially | A DoS attack was designed and implemented, although the results as to its effectiveness are inconclusive; more research is required on this attack to properly evaluate its success.

Appendix 9 - Evaluating Non-Functional requirements

Quality characteristic | Requirement | Achieved? | Comments
Portability | Able to run the application without installing Tor | |
Portability | Able to run on multiple platforms (Windows, Mac, Linux) | | On some operating systems, libraries the application depends on will need installing. Python is also required to be installed.
Portability | Not require the application to be installed to run | |
Reliability | No more than 10 bugs on delivery | Somewhat | There are currently no known bugs in the application, but this does not mean there are no bugs in the software.
Efficiency | Use as little computational resources as possible such as RAM. (No more than 1 GB of RAM) | |
Usability | No GUI | Yes |
Usability | Precise and constructive error messages | Yes |
Usability | Documentation | Yes |
Usability | Universal naming standard | Yes |
Dependability | Able to operate normally or abnormally without threat to life or environment | |
Legal | Only use open source software | Yes |
Maintainability | Able to expand the system to incorporate new features, fix defects or deal with new technology. | |
Adaptability | Able to change the system to handle additional domain concepts | |

Appendix 10 – Actual project schedule

[Gantt chart. Weekly columns from 7 May to 12 September. Task rows: Project Initialization Document; Ethical document; Research literature; Learn programming language; Iteration 1; Iteration 2; Iteration 3; Iteration 4; Iteration 5; Development overrun; Check formatting and proof reading dissertation.]