VIDEO SYNCHRONIZATION IN THE
CLOUD USING VISIBLE LIGHT
COMMUNICATION
Maziar Mehrabi
Master of Science Thesis
Supervisor: Dr. Sébastien Lafond
Instructor: Le Wang
Embedded Systems Laboratory
Faculty of Science and Engineering
Åbo Akademi University
January 2015
ABSTRACT
Video synchronization refers to the time-based alignment of several audio/video streams.
The growth of heterogeneous social media networks demands faster and more efficient
synchronization methods that can satisfy the real-time requirements of the media cloud.
Although many synchronization techniques and methods have been proposed or are
already in use, this thesis suggests a novel approach that harnesses the capabilities of
Visible Light Communication (VLC) to provide more robust and efficient video
synchronization.
This thesis introduces the design and implementation of a VLC-based video
synchronization prototype. The synchronization of different video streams is provided
by means of VLC through Light Emitting Diode (LED) lights and digital phone
cameras. This is achieved by embedding the necessary information as light patterns
in the video content. These patterns can later be recognized by processing the video
streams. In addition to synchronization, the information transmitted through VLC can
enable many other applications.
This method of synchronization is needed in cases where several heterogeneous
camera-equipped devices (e.g. cellular smart phones) are live-streaming video content
to a media server or cloud environment. In addition, the means of VLC can
be exploited to carry information for purposes other than video synchronization.
The approach presented in this work does not require modification of software or
hardware components of the camera device.
Keywords: LED, Digital Camera, Rolling Shutter, Video Stream, Video Frame, Microcontroller, Carrier Frequency
CONTENTS
Abstract

Contents

List of Figures

Glossary

1 Introduction
  1.1 Objective of the thesis
  1.2 Thesis structure

2 Background and Related Work
  2.1 Synchronization of Social Video Streams
  2.2 Previous and Related Work
  2.3 Summary

3 VLC-based Video Synchronization
  3.1 Visible Light Communication
    3.1.1 The Rolling Shutter Effect
  3.2 A Video Synchronization System
  3.3 Summary

4 Design and Implementation
  4.1 Physical Layer and Devices
    4.1.1 Electronic Components
    4.1.2 Circuit Isolation
    4.1.3 Circuitry
  4.2 VLC Transmitter
    4.2.1 Modulation Techniques
    4.2.2 Bandwidth Limitations
    4.2.3 Frequency Selection
    4.2.4 Lifetime of Frequencies
    4.2.5 Flicker Improvement
  4.3 VLC Receiver
    4.3.1 Architectural Overview
    4.3.2 Thresholding
    4.3.3 Discrete Fourier Transform
  4.4 Summary

5 Results and Evaluation
  5.1 Evaluation Setup
  5.2 Evaluation Benchmarks
  5.3 GPU Acceleration
  5.4 Quality
  5.5 Summary

6 Conclusion and future work
  6.1 Conclusion
  6.2 Future work
  6.3 Discussion

Bibliography

A Appendix
  A.1 Direct Communication With The Arduino Board
  A.2 Interfacing The Serial Connection
  A.3 Character Lookup Table
LIST OF FIGURES
2.1 Use case scenario
2.2 Unsynchronized video streams
2.3 Requirements of synchronization with reference clock
2.4 Synchronization using audio patterns

3.1 Data transmission through visible light
3.2 The rolling shutter effect on fast moving train
3.3 The rolling shutter effect on blinking LEDs
3.4 Fast blinking light sources captured by the rolling shutter
3.5 Architecture of the video synchronization system
3.6 Inter-frame synchronization
3.7 Closed GOP
3.8 Open GOP

4.1 SMD LED plate of IKEA Ledare E27
4.2 Circuit schematic for modulating low power LED
4.3 Darlington Pair implemented in TIP120 ICs
4.4 Optocoupler
4.5 Circuit schematic for the transmitter
4.6 Hardware components of the prototype
4.7 Modulation techniques
4.8 Block-based video compression effect
4.9 FT of the frame shown in Figure 3.4
4.10 Block-based motion estimation effect
4.11 Guard intervals
4.12 Odd harmonic frequencies in a square wave
4.13 Breaking of a frequency in two frames
4.14 Flicker compensation using duty-cycle modification
4.15 Receiving server
4.16 Picture conversion and image processing
4.17 Finding the threshold value
4.18 A frame carrying three frequencies
4.19 Time to frequency conversion
4.20 Spectrum centralization
4.21 DFT conversion of the picture shown in Figure 4.18

5.1 Test scenario for the video synchronizer
5.2 Unsynchronized video streams on playback
5.3 Synchronized video streams on playback
5.4 Performance profiling results
5.5 Sequence of frames with maximum one VLC frame

6.1 Adding sinusoid frequencies
6.2 Stitching images - Sliding window
6.3 Processing times in pipelined and non-pipelined task scheduling
GLOSSARY
• 3G
Third Generation.
• 4G
Fourth Generation.
• BJT
Bipolar Junction Transistor.
• BT
Bluetooth.
• CCD
Charge-Coupled Device.
• CFL
Compact Fluorescent Lamp.
• CMOS
Complementary Metal-Oxide-Semiconductor.
• DFT
Discrete Fourier Transform.
• EU
European Union.
• FEC
Forward Error Correction.
• FFT
Fast Fourier Transform.
• FPS
Frames Per Second.
• FSK
Frequency Shift Keying.
• IC
Integrated Circuit.
• IoT
Internet of Things.
• IR
Infrared.
• ISI
Inter-Symbol Interference.
• LAN
Local Area Network.
• LED
Light Emitting Diode.
• MOSFET
Metal-Oxide-Semiconductor Field-Effect Transistor.
• OFDM
Orthogonal Frequency Division Multiplexing.
• QR code
Quick Response Code.
• SMD
Surface Mount Device.
• SNR
Signal to Noise Ratio.
• THD
Total Harmonic Distortion.
• UV
Ultraviolet.
• VLC
Visible Light Communication.
• Wi-Fi
Wireless local area networking technology (IEEE 802.11).
1 INTRODUCTION
In recent years the number of hand-held devices with multimedia capabilities has
increased. In addition, these devices usually have the ability to share and access video
content via the Internet [1]. This allows ordinary mobile phone users to
generate and distribute high-quality content [2] such as images and video. Studies in
[3] and [4] have forecast a dramatic growth in both the number of connected devices
and the share of video content in global consumer Internet traffic. Moreover, the
increasing popularity of social networking, media sharing and network-enabled
cameras (i.e. IP cameras [5]) leads to a situation in which many video streams are
available for a particular live event.
One of the challenges of maintaining a live media streaming service is to keep
multiple video streams synchronized [6]. The synchronization problem arises because the
video streams are distributed through different network infrastructures (e.g. 3G, 4G,
Wi-Fi, LAN, or any combination of these [7]) with different characteristics (such
as jitter, delay and speed); hence each video stream may be exposed to a certain
amount of delay [8] [9], resulting in unsynchronized video streams at the destination. The
other main reason for unsynchronized video streams, regardless of network facilities,
is the different starting point of each recording camera, which leads to different timestamps for
identical frames among several video streams. One way to achieve video synchronization
is to make use of the visual information available in the video [10]. In this work,
features of visible light are utilized to provide the necessary visual information for the
synchronization.
Visible Light Communication (VLC) refers to wireless communication using the
visible light spectrum, i.e. wavelengths from 380 nm (violet) to 780 nm (red) [11]. One
of the main advantages of VLC is its ability to be combined with existing lighting
sources in our environment, making it efficient and suitable for ubiquitous computing
applications. VLC can be utilized in transportation systems, machine-to-machine
communication [12], underwater communication [13] and so on. Unlike other radio
technologies, light cannot travel through non-transparent material. This feature makes
VLC an ideal solution for e.g. indoor wireless communication [14] [12].
The energy efficiency of LEDs compared to other lighting technologies
such as Compact Fluorescent Lamps (CFL) or incandescent light bulbs [15] makes them
the next dominant solution for the lighting industry [16] [17]. LEDs' share of the world
lighting market has been increasing over the past few years and is predicted to exceed
30% in 2016 [18] [19] [20]. Furthermore, as LED light
modules are driven by DC power and can be rapidly modulated, they are
suitable candidates for smart lighting [21], smart spaces and cities [22] [23], Internet of
Things (IoT) and VLC applications. The combination of LEDs and data transmission
makes VLC an emerging technology and an interesting research topic.
Moreover, ultraviolet and infrared technologies are considered outdated
for many applications, as newly manufactured camera-equipped devices
(such as smart phones) are provided with ultraviolet (UV) and infrared (IR)
blocking filters [24] [25] [26]. Also, as explained in chapter 8 of [27], characteristics
such as distance impact and noise vulnerability are relatively the same between visible
light and infrared communication technologies.
1.1 Objective of the thesis
The main objective of this thesis is to design, implement and evaluate a VLC-based
video synchronization system. In order to make VLC available, it is necessary to
implement a platform that can be utilized to build VLC-based applications. In this case
a real-time video synchronization system is implemented to exploit the capabilities of
such a platform. Additionally, the VLC platform could be utilized for other VLC-based
use cases as well. This platform is designed and implemented with the
scalability and isolation characteristics of cloud computing in mind.
This thesis explains the implementation details of the mentioned platform. The
platform consists of hardware and software components that transmit information through
physical light bulb modules. In addition, camera-equipped mobile phones are used to
generate video content. The VLC receiver and video synchronization software
components of this platform are implemented in an isolated and scalable manner. In this
way, the scalability and transparency features of the cloud environment can be harnessed for
application deployment.
This thesis also explains the challenges and constraints of implementing a camera-
based VLC system. Additionally, solutions, instructions and suggestions are also
presented in this thesis in order to overcome the mentioned challenges.
1.2 Thesis structure
The thesis is divided into six chapters. Chapter 2 explains the background and
previous work related to VLC and video synchronization. Chapter 3 describes
the proposed architecture for the video synchronization in detail. Chapter 4 describes
the implementation details of all elements of the prototype, namely the electronic
components, the transmitter software and a receiving server. Evaluation results are presented in
Chapter 5. Chapter 6 presents the conclusions and future work, followed by a
discussion about the future topics of lighting and VLC systems.
2 BACKGROUND AND RELATED WORK
This chapter first states the problem of video synchronization and then surveys related
work and prior art in video synchronization. Traditional synchronization methods
mainly focus on the synchronization of audio and video streams (lip synchronization) [28]
or the synchronization of separate video streams in a centralized system. However, these
solutions are not usually suitable for a decentralized heterogeneous system.
2.1 Synchronization of Social Video Streams
It is necessary for a multi-camera system to maintain the synchronization of separate
video streams. The need for video synchronization arises for a variety of
reasons, such as analysing visual information [10], identification, activity recognition
and also live broadcasting of sport events [29]. In addition, providing time-based
alignment becomes more challenging in a non-centralized multi-camera system [30]
consisting of heterogeneous camera-equipped smart phones. Whitehead et
al. define the video synchronization problem as follows: "Given k different video
sequences that overlap in time, identify one frame from each of the different sequences
that refer to the same point in (universal) time." [31].
Figure 2.1 shows a use case in which a number of sources such as audience members,
skycams and camera operators (called producers) are streaming live video content to
a cloud-based media server. The video streams are then tailored (e.g. transcoded)
according to the next component's needs in software containers and sent to the director.
The director has the responsibility of selecting the optimal video stream based on certain
constraints (e.g. video quality, best viewpoint, etc.) and re-streaming the selected
video stream to a broadcasting server. Therefore, it is necessary for the director to have
the video streams synchronized in order to make a fair decision.
[Figure: a scene/arena/object is captured by producers, who stream over 3G, 4G and Wi-Fi to software containers, then to the director and on to the broadcaster.]
Figure 2.1: Use case scenario
The reasons why video streams are delivered unsynchronized fall into several
categories, such as the impact of heterogeneous network infrastructures, a different start
time for each camera and frame rate variations. Figure 2.2 shows a scenario where two
different types of camera have started recording an event at different times and from
different viewpoints. In the meantime the recorded media streams are being streamed
to a media gateway through different network connections. However, when the streams
arrive at the media gateway there is a time offset (shown as ∆t) among identical
frames in the different video streams. In the example shown in Figure 2.2, ∆t has a
minimum value of 7 seconds (neglecting network impact). The solution to the
synchronization problem is to delay the earlier video stream by ∆t. However,
finding ∆t is the challenge that broadcasting services are struggling with.
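Once ∆t is known, the alignment itself reduces to simple timestamp arithmetic. The following sketch (illustrative only; the function name and HH:MM:SS format are assumptions, not part of the thesis prototype) computes the minimum ∆t of Figure 2.2 from the two record times:

```python
from datetime import timedelta

def stream_offset(record_time_a, record_time_b):
    """Offset between two streams whose wall-clock record times
    (HH:MM:SS) mark the same instant of the event."""
    def parse(t):
        h, m, s = (int(x) for x in t.split(":"))
        return timedelta(hours=h, minutes=m, seconds=s)
    return abs(parse(record_time_a) - parse(record_time_b))

# Figure 2.2: record times 00:02:36 and 00:02:43 give a minimum
# offset of 7 seconds; the earlier stream is delayed by this amount.
dt = stream_offset("00:02:36", "00:02:43")
print(dt.total_seconds())  # 7.0
```

In practice the hard part is estimating ∆t in the first place, since the cameras do not share record times; that is what the VLC approach of this thesis provides.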
[Figure: two cameras with record times 00:02:36 and 00:02:43 stream live video over LAN and 4G to a media gateway; at the current playback time, video streams 1 and 2 show an offset of Δt.]
Figure 2.2: Unsynchronized video streams
2.2 Previous and Related Work
As explained in Chapter 7 of [32], the synchronization problem imposed by the
networks can be solved by defining a global clock (using e.g. the methods of the Network Time
Protocol [33]) as a reference for all cameras. This solution is shown in Figure 2.3.
[Figure: two cameras stamp frames on their video timelines against reference clocks that are kept aligned by a clock synchronization protocol; video buffers, network packets and sister protocol packets all refer to the same clocks, so the offset Δt between the timelines can be detected.]
Figure 2.3: Requirements of synchronization with reference clock
Figure 2.3 shows a setting where two different cameras are streaming video content
over the network. These cameras stamp the video content (i.e. each frame) by referring
to a reference clock. Moreover, each network packet becomes ready after its
header is updated with the current time. The reference clocks of the
cameras are synchronized using a synchronization protocol. Hence any delay
caused by a camera's internal characteristics or network jitter can be detected at the
receiver using the globally synchronized timestamps. If there are any sister protocols
in use, the reference clock can be used to form the necessary packets. For example,
the Real-time Transport Control Protocol (RTCP) is a sister protocol of the Real-time
Transport Protocol (RTP).
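For concreteness, the offset and delay estimation that such a clock synchronization protocol performs can be sketched with the classic NTP formulas (this is the standard NTP calculation, not an implementation from the thesis; the timestamps are invented):

```python
def ntp_estimate(t0, t1, t2, t3):
    """One NTP request/response round: t0 = client send, t1 = server
    receive, t2 = server send, t3 = client receive (seconds).
    Returns (clock offset of server relative to client, round-trip delay)."""
    offset = ((t1 - t0) + (t2 - t3)) / 2.0
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay

# Client clock running 0.5 s behind the server, 0.1 s one-way network delay:
offset, delay = ntp_estimate(t0=10.0, t1=10.6, t2=10.7, t3=10.3)
print(round(offset, 6), round(delay, 6))  # 0.5 0.2
```

Each camera would repeat such exchanges to keep its reference clock within a bounded error of the global clock.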
However, the solution shown in Figure 2.3 requires the system to be centralized,
as each camera needs to communicate with a central server in order to make itself aware
of the global consensus on time. It therefore requires the camera-equipped device
to have this feature implemented. In contrast, smart phones tend to have
independent implementations of their video recording software, in which the starting
timestamp of every separate video is set to zero.
Another approach to achieve synchronization is through visual information. Many
of the related prior works rely on homographic relations between pictures, which require the cameras to
be very close to each other (to ignore the parallax effect) and in a static position [10]
[34] [35] [36] [37], and/or are limited in the number of cameras, leading to scalability and
mobility limitations [38]. These constraints limit harnessing the scalability features of
cloud computing. A common approach in many of the mentioned methods is to identify
and track a feature or a moving object in the scene [39] [40] [41]. Nonetheless, these
methods require the object to be in the line of sight of all cameras.
Researchers in [42] and [43] have developed a synchronization method that
uses the patterns in audio channels as a reference to find identical frames. Figure 2.4
illustrates this method.
[Figure: audio tracks of streams 1, 2 and 3 plotted over time, with the selected common audio pattern highlighted in each track.]
Figure 2.4: Synchronization using audio patterns
Figure 2.4 depicts three audio tracks, each of which belongs to a different video
stream recording the same event. Although these audio streams are not exactly the
same, they often contain similar patterns. These patterns (shown as rectangular
shapes in Figure 2.4) can be, for example, a high-pitched tune or a specific announced
word. Once the synchronizer detects a common pattern between these audio streams,
it can determine the conveyed delay and align the video frames corresponding to the
audio pattern. The advantage of this method over other pixel-based methods is that
the audio information usually contains much less data to process and the algorithms are
relatively more straightforward to implement.
However, the drawback of this method is that the same audio footprint must be
present in all video channels. It is challenging to satisfy this requirement in big and
crowded events (e.g. a football match in a stadium), mute spaces and audio-edited (i.e.
remixed/remuxed) channels. Furthermore, this method is not suitable for real-time
video streaming because the audio tracks must be buffered long enough to provide the
necessary data for pattern recognition.
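As a rough illustration of how such audio pattern matching can work, the sketch below estimates the delay between two tracks as the peak of their cross-correlation. This is a simplified stand-in for the methods of [42] and [43], not their actual algorithm, and all signal parameters are invented for the example:

```python
import numpy as np

def audio_delay(track_a, track_b, rate):
    """Seconds by which the common pattern appears later in track_b
    than in track_a (peak of the cross-correlation)."""
    corr = np.correlate(track_b, track_a, mode="full")
    shift = np.argmax(corr) - (len(track_a) - 1)
    return shift / rate

rate = 1000  # samples per second
t = np.arange(0, 1.0, 1 / rate)
ping = np.sin(2 * np.pi * 50 * t) * np.exp(-5 * t)  # a short distinctive sound
a = np.concatenate([np.zeros(200), ping])  # pattern starts at 0.2 s
b = np.concatenate([np.zeros(500), ping])  # pattern starts at 0.5 s
print(audio_delay(a, b, rate))  # 0.3
```

Note that both tracks must already be buffered over a window containing the shared pattern, which is exactly why this approach struggles with real-time streaming.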
2.3 Summary
This chapter explained the problem of video synchronization in social video streaming
environments. The need for video synchronization has attracted a lot of research effort.
However, traditional methods of synchronization are not always suitable for today's
expansion of social networking, due to its decentralized and heterogeneous characteristics.
High computation costs and low system accuracy are other drawbacks of traditional
synchronization techniques. In the following chapter we will explain how inter-frame
video synchronization can be done by using VLC.
3 VLC-BASED VIDEO SYNCHRONIZATION
This chapter introduces Visible Light Communication (VLC) systems and their
applications and provides a brief description of the proposed architecture for the video
synchronization system. More details on the implementation of the transmitter and
receiver are given in Chapter 4. In this thesis, the term receiver usually refers to
the pure VLC receiver (i.e. a camera and other video processing components), while
a video synchronization system (server) is an application that is developed on top of
the implemented VLC system using the provided Application Programming Interface
(API). This is because the implemented VLC prototype can be used for various types
of applications.
This chapter is divided into three sections. Section 3.1 presents the concept of VLC
and the rolling shutter effect. Section 3.2 gives an overview of the architecture of the
synchronization system and its synchronization servers, and Section 3.3 summarizes the chapter.
3.1 Visible Light Communication
VLC is a form of optical wireless communication carried on top of illumination
(i.e. visible) light [44]. In recent years, a growing interest in VLC has
been seen among researchers [27]. The advantages of VLC have opened a new range
of applications in ubiquitous computing, IoT and wireless communications. VLC has
advanced significantly with the growth of LEDs [44]. As LEDs are
suitable candidates for VLC transmitters [45], some of the advantages of VLC systems
are inherited from LED characteristics. The main advantages are the following:
(a) Safety: Visible light, as the most natural form of radiation, is known to be harmless
to the human body [46] [47]. The ability of LED lights to be modulated faster than
the eye can perceive also ensures a level of comfort for human observers [11]. Moreover, compared
to other radio technologies, VLC does not interfere with aircraft equipment or
medical devices [48].
(b) Availability: The deployment of dual-service VLC (lighting and data communication)
applications becomes easier with the high availability of artificial
light in human environments [49]. The growing interest in solid-state lighting and
LEDs also helps this development [50].
In addition, exploiting VLC does not require a license or certificate for the radio
spectrum [48]. Furthermore, the massive usage of camera-equipped devices
strengthens the availability feature of VLC in our application.
(c) Efficiency: LED lighting is known to be more efficient than other sources
of lighting [15]. Long life expectancy, high humidity tolerance and minimal heat
dissipation are other advantageous aspects of LED lights [49].
The receiving end of an optical transmission system is a photodetector, a
device that generates an electrical charge (i.e. current) when exposed
to visible light [27]. Although photodetectors can convert high-rate light pulses
accurately, their sensitivity to background light noise results in a very low Signal-to-Noise
Ratio (SNR) [51]; therefore, implementations should provide a level of
channel isolation. Complementary Metal-Oxide-Semiconductor (CMOS) and Charge-Coupled
Device (CCD) sensors in digital cameras are also examples of light detector
sensors [52] [53]. Nakagawa, M. and Haruyama, S. have invented a camera-equipped
cellular device capable of receiving VLC through the camera and a secondary light
receiving unit [54].
[Figure: the transmitter (data source → modulator → light source) sends modulated light over the optical medium to the receiver (light sensor → demodulator → output data).]
Figure 3.1: Data transmission through visible light.
Figure 3.1 shows how data transmission through visible light takes place. The
VLC transmitter modulates the input data using a modulation technique (explained in
Section 4.2.1) and transfers the modulated signals to the light source units (typically
LEDs). The modulated light then travels through the optical medium (i.e. free space,
water or other forms of transparent matter). On the receiver side, the modulated light
is captured and transformed into electrical signals using light sensor units (i.e.
photodetector, CCD, CMOS, LED, etc.). These signals are then demodulated and decoded
into output data using dedicated hardware or software components.
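As a minimal illustration of this modulate/demodulate pipeline, the sketch below simulates On-Off Keying over a noisy optical channel. This is a simplified model for explanation only, not the modulation scheme of the prototype (which is discussed in Section 4.2.1):

```python
import numpy as np

def ook_modulate(bits, samples_per_bit=8):
    """On-Off Keying: each bit becomes a run of light-on (1.0) or
    light-off (0.0) samples driving the LED."""
    return np.repeat(np.array(bits, dtype=float), samples_per_bit)

def ook_demodulate(signal, samples_per_bit=8, threshold=0.5):
    """Average the captured brightness over each bit period and threshold."""
    frames = signal.reshape(-1, samples_per_bit)
    return [int(m > threshold) for m in frames.mean(axis=1)]

bits = [1, 0, 1, 1, 0]
light = ook_modulate(bits)                      # transmitter side
noisy = light + np.random.default_rng(0).normal(0, 0.1, light.size)  # channel
print(ook_demodulate(noisy))  # [1, 0, 1, 1, 0]
```

Averaging over the bit period is what gives the receiver its noise margin; the same idea carries over to the frequency-based scheme used later, where the DFT plays the role of the averaging step.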
Although LEDs are designed to emit light, they can also act as photodetectors [55].
As suggested by Schmid, S. et al. in [56], this feature can be exploited to
build bi-directional and cost-efficient optical communication channels. However,
implementing a VLC system using this feature requires precise synchronization between
send and receive phases and special circuitry for detecting the current generated by the
LEDs. Alternatively, bi-directional communication can be achieved by designing
hybrid networks where the downlink is carried out by visible light and the uplink
is carried by WiFi [57] [58], infrared [59], etc.
The IEEE 802.15.7 standard [60] defines three operating modes for the VLC
physical layer. The highest data rate in the standard is 96 Mb/s. However, several studies
have reported higher data rates, such as 575 Mb/s in [61] and 875 Mb/s in [62].
Nevertheless, the high data rates in the mentioned demonstrations are achieved by using
photodetectors in the receiver; implementation examples that use a commercial
camera as the receiver are scarce.
The papers [63], [64] and [65] demonstrate VLC systems that take advantage of high
speed cameras (i.e. 1000 fps). However, high speed cameras are not widely available
on mobile phones.
3.1.1 The Rolling Shutter Effect
In order to use a digital camera as the VLC receiver, we exploit the rolling shutter
effect of digital cameras. Although this effect was not intentionally designed to
be a feature of digital cameras, it can be exploited to enable communication through
visible light.
The image sensors built into digital cameras fall into two main categories, namely
CCD and CMOS [66]. In our application, the main difference between CCD and
CMOS technologies is their shuttering mechanism.
The pixels of a picture in CCD sensors are captured globally during one exposure
period, while CMOS sensors capture the scene by sweeping lines of pixels [67] (either
horizontally or vertically). Hence, the shuttering mechanism is called global [68] in
CCD sensors and rolling [69] in CMOS sensors. The energy efficiency of CMOS
sensors [70] has made them a suitable candidate for battery-powered devices such as
smartphones [71]. The rolling shutter effect becomes visible when the recorded video
contains a fast moving object such as a fan or helicopter propeller. An example of
the rolling shutter effect is shown in Figure 3.2, where the vertical lines on the moving
train (borders of doors and windows) appear tilted in relation to still objects in
the picture.
Figure 3.2: The rolling shutter effect on fast moving train.
Figure 3.3 shows how the rolling shutter effect captures a fast blinking LED. The
example depicts the position of a horizontal rolling shutter in a
scene with respect to time. The rolling shutter scans the scene to form a picture consisting
of M rows. A fast blinking light source (i.e. an LED) with a blinking period of T exists
in the background of the scene. This causes the finalized picture to be divided into
sets of rows, where each set of rows represents either the on or the off state of the light source.
Figure 3.4 shows the same effect in a real application, where two blinking LED light
bulbs are captured by a digital phone camera. The on and off states of the LEDs appear
as dark and bright bands in the picture. Researchers have used this technique
to calibrate cameras for rolling-shutter rectification in [72].
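The banding mechanism can be illustrated with a small simulation: each image row samples the LED state at its own readout time, so a square-wave light source leaves alternating bright and dark row bands. The row count, readout time and LED frequency below are invented for illustration and are not the prototype's parameters:

```python
import numpy as np

def rolling_shutter_rows(num_rows, row_time, led_freq, duty=0.5):
    """Return 1 (bright) or 0 (dark) per image row, assuming row i is
    exposed at time i * row_time while the LED blinks as a square wave
    at led_freq Hz with the given duty cycle."""
    t = np.arange(num_rows) * row_time
    phase = (t * led_freq) % 1.0
    return (phase < duty).astype(int)

# 720 rows read out over ~1/60 s, LED blinking at 1 kHz:
rows = rolling_shutter_rows(720, row_time=1 / (60 * 720), led_freq=1000)
# Contiguous runs of 1s and 0s are the bright/dark bands of Figure 3.4.
bands = np.diff(rows).nonzero()[0].size + 1
print(bands)
```

With these numbers roughly 16–17 LED periods fit into one frame readout, so a few dozen bands appear; the band spacing encodes the blinking frequency, which is what the receiver later recovers.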
[Figure: the rolling shutter position sweeps from row 0 to row M over time while the LED blinks with period T; in the resulting picture, alternating sets of rows correspond to the on state and the off state of the light source.]
Figure 3.3: The rolling shutter effect on blinking LEDs
Figure 3.4: Fast blinking light sources captured by the rolling shutter
Combining modulated visible light with the effect of the rolling shutter has been a
point of interest for researchers. For example, Rajagopal, N. et al. in [73] presented
a technique for sending data from solid-state luminaires (i.e. LEDs) to rolling shutter
cameras on smartphones, with a focus on a communication channel using Frequency
Shift Keying (FSK). However, the server end of their receiver is implemented in
Matlab, which is not appropriate for a real-time system because of the inefficiency and
delay that Matlab imposes on the system. Moreover, offline recorded video content
does not reflect the characteristics of a real-time system. In addition, the camera end of
their receiver is controlled by a specifically implemented application [74]. In this way
the end-users lose the freedom to select their desired camcorder software applications
and are forced to use customized applications.
A similar approach is used in [75] to perform accurate indoor positioning using LED
lights and smartphone cameras. In this technique, the LEDs broadcast a constant
landmark beacon while the camera captures pictures and sends them to a cloudlet,
in which the images are processed in order to identify the landmarks. The accurate
position is then calculated based on the angle between a light source and the camera.
Moreover, Matsumoto, Y. et al. have developed a CMOS sensor for positioning
purposes in visible light ID systems [76].
Moreover, researchers in [77] suggest a camera-based VLC system which uses
display screens as the sender and transmits data by modulating the density of the color
channels (i.e. red, green and blue). Although this approach appears suitable for
barcode scanning [78] (e.g. Quick Response (QR) codes), the limitations of display
screens in luminosity and refresh rate do not allow faster data transmission or further
advances in VLC systems built on display screens.
3.2 A Video Synchronization System
Figure 3.5 illustrates the architecture of the video synchronization system that uses
visible light as its reference for synchronization. The light sources are controlled by a
microcontroller placed between the power supply and the light sources. In this
prototype, standard commercial LED light bulbs are used that can be plugged
into normal AC power sockets. Since LEDs are powered by DC current, the light bulbs
are equipped with a built-in AC-to-DC converter.
The microcontroller can be pre-programmed or can receive commands in real time
through a serial connection (or an interface to the serial connection, e.g. USB, WiFi,
etc.) from another device with human interaction, e.g. another microcontroller, PC
or handheld device. The microcontroller is then responsible for interpreting the
commands and translating the information into light-compatible pulses.
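The final translation step can be sketched as follows. This is a minimal illustration rather than the thesis firmware, and `halfPeriodMicros` is a hypothetical helper name: it converts a commanded carrier frequency into the interval at which the LED driver pin must be toggled.

```cpp
#include <cstdint>

// Hypothetical helper: convert a carrier frequency commanded over the
// serial link (in Hz) into the half-period, in microseconds, at which
// the microcontroller toggles the LED driver pin. A square wave of
// frequency f completes one on/off cycle every 1e6 / f microseconds,
// and each half-cycle (on or off) lasts half of that.
uint32_t halfPeriodMicros(uint32_t freqHz) {
    return 1000000u / (2u * freqHz);
}
```

For instance, a commanded 1.077 kHz carrier corresponds to toggling the pin every 464 µs.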
Figure 3.5: Architecture of the video synchronization system
The transmitted information is then captured by the smartphone cameras and streamed
to the synchronization servers over the network. The synchronization server is placed
in a virtualized environment and receives the incoming multimedia streams. In the
meantime, the multimedia streams are broadcast out of the server for playback. In
this way the receivers (whether a director in the use cases explained in Chapter 2, an
end-user receiver or even another broadcasting server) are not aware of the activities
of the middle servers, hence a level of transparency is provided at this stage. While the
multimedia streams are being streamed out, the system checks whether any of the video
streams require synchronization. In case the video streams are out of sync, the system
imposes the needed amount of delay on the earlier stream. In this way, the outgoing
video streams become synchronized.
The isolation provided by the virtualized servers makes the cloud a suitable
platform for the deployment of this system. In addition, cloud computing can
provide a scalable and elastic environment for this application as the number of
incoming and outgoing video streams changes over time. This can be achieved by
allocating more virtual machines in the cloud environment as more users start
uploading video content to the cloud.
As explained in Chapter 2, the existing methods of video synchronization are not
suitable for a cloud-based and heterogeneous video streaming system. Hence, the
means of VLC are exploited to embed the information required for video
synchronization in the video stream. The information sent by the light sources can
represent timestamps that serve as checkpoints for the synchronization server.
Figure 3.6 shows how this process takes place.
Figure 3.6: Inter-frame synchronization
In Figure 3.6, a streaming thread refers to a software thread which handles a video
stream. These threads are spawned by the main thread (orchestrator), one for every
incoming media stream. The incoming video frames are buffered, decoded and then
processed (explained in more detail in Chapter 4.3). If a video frame contains the
transmitted information (a time stamp or a flag), the position of that particular video
frame in the video stream is used as a checkpoint for alignment with the other video
streams. In this case the Presentation Time Stamp (PTS) of the frame is sent back to
the main thread. The main thread uses the returned PTSs to determine the actual time
difference between identical frames in separate video streams. This time difference (if
any) is then used as the reference for delaying the earlier media stream.
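As a sketch of this decision step (illustrative only, not the thesis code; a two-stream case with checkpoint PTS values expressed in a shared time base is assumed):

```cpp
#include <cstdint>

// Decision returned to the streaming threads: which stream must be
// delayed, and by how many PTS ticks.
struct SyncDecision {
    int streamToDelay;   // index of the stream that is running ahead
    int64_t delayTicks;  // time difference between the two checkpoints
};

// The same light checkpoint was observed at checkpointPts0 in stream 0
// and at checkpointPts1 in stream 1. The stream in which the checkpoint
// appears at the smaller PTS is taken to be ahead of the other and must
// be delayed by the difference. (Convention assumed for illustration.)
SyncDecision align(int64_t checkpointPts0, int64_t checkpointPts1) {
    if (checkpointPts0 < checkpointPts1)
        return {0, checkpointPts1 - checkpointPts0};
    return {1, checkpointPts0 - checkpointPts1};
}
```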
The codec structure of the video stream might provide several types of time stamps.
For example, in H.264 there are two time stamps available for a frame: the Decode
Time Stamp (DTS) and the PTS. The PTS of a frame is a field in the frame's metadata
which indicates the time at which the video frame should be played back by the video
player. Similarly, the DTS refers to the time at which the frame should be decoded.
Since some video frames may serve as reference frames for other past or future
frames, the DTS and PTS of a frame are not necessarily equal.
Figure 3.7 illustrates an example of a closed Group Of Pictures (GOP) in a video
stream and Figure 3.8 illustrates an example of an open GOP. The arrows in
Figure 3.7 and Figure 3.8 indicate the data dependencies between the frames.
I-frames and P-frames are reference frames, while B-frames are non-reference
frames. The λ value depends on the frame rate of the video stream (e.g. 33 ms for a
30 fps video stream). The ε value ensures that a frame is decoded before its playback
time; in practice it can be proportional to the time it takes to decode one frame. For
simplicity, the starting time for this GOP is assumed to be 0. In practice, if this value
is t (where t ≠ 0), the DTS values become t − ε, t + λ − ε, t + 2λ − ε and so on. The
same pattern also applies to the PTS values.
[Figure content: frame sequence I B B P B B P B B P B B (GOP n) with PTS values 0, λ, 2λ, ..., 11λ and the corresponding DTS values offset by −ε.]
Figure 3.7: Closed GOP
In an open GOP, the B-frames at the beginning of the GOP depend on the reference
frames in the previous GOP.
[Figure content: the same frame sequence spanning GOP n−1, GOP n and GOP n+1; the leading B-frames of each GOP reference frames of the neighbouring GOP, with PTS values continuing through 12λ, 13λ, 14λ, 15λ and DTS values offset by −ε.]
Figure 3.8: Open GOP
The PTS and DTS values are assigned to each frame by the encoder at encode time.
Moreover, the video frames might not be streamed in the same order as they were
encoded: it is more practical to send the reference frames before the frames that
depend on them. This means that although the frame sequence after the encoding
phase is IBBPBBP. . . , the received frame sequence after network transmission might
be IPBBPBB. . . . Since there is no control over the video encoding and transmission
phases, the PTS is selected as the main time stamp reference. The actual time stamp
has to be calculated from the time base units of the video stream.
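The time-base arithmetic can be sketched as follows; this assumes FFmpeg's convention of expressing a stream's time base as a rational number num/den (seconds per PTS tick):

```cpp
#include <cstdint>

// One PTS tick corresponds to num/den seconds, so a frame's wall-clock
// presentation time is pts * num / den. For example, MPEG-TS uses a
// 1/90000 time base, so PTS 90000 marks the one-second point.
double ptsToSeconds(int64_t pts, int num, int den) {
    return static_cast<double>(pts) * num / den;
}
```

With a 1001/30000 time base (one tick per frame at 29.97 fps), frame 30 plays at roughly 1.001 seconds.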
In order to impose the necessary amount of delay on the earlier video stream, the
thread responsible for streaming that video should skip outputting a certain number
of frames (equivalent to the required delay). This can be achieved by using a seek
method on the video buffer or by repeatedly sending the current video frame for a
certain amount of time (i.e. freezing the picture). Manipulating the PTS/DTS fields
in a frame's metadata, in contrast, might not work properly with all video players:
many players assume that non-monotonically increasing timestamps are due to
network packet loss, so they ignore the time offset and continue playing the video
stream normally.
3.3 Summary
In this chapter we introduced the basics of VLC and explained how VLC can take
place through conventional digital cameras. This can be done by taking advantage of
the rolling shutter effect of digital cameras. Although the rolling shutter effect was not
a design intention at the beginning, its side effects can be harnessed for other applica-
tions. In addition, this chapter explained how video synchronization becomes possible
by the means of VLC. Finally, the initial set-up and architecture of the synchronization
mechanism was described. More details on implementation aspects of this design will
be presented in the next chapter.
4 DESIGN AND IMPLEMENTATION
This chapter explains the implementation details in three subsections. Section 4.1
describes the physical components required for setting up a controllable LED lighting
system, and also covers the considerations regarding the high voltage characteristics
of commercial LEDs. Other transmission details such as modulation, bandwidth and
physical improvements are given in Section 4.2. Finally, the aspects related to the
design and implementation of the receiver are given in Section 4.3.
The implementation of this work is split into hardware and software parts. The
hardware design starts with setting up the electronics and LED light bulbs to form
a basic VLC transmitter. This part continues with interfacing and programming the
microcontrollers and defining the means of transmission in the physical layer. The
software implementation consists of video streaming (from smart phone cameras to
media servers and from media servers to external media players), bit pattern detection
and machine vision.
The improvements and optimizations performed in this work are also presented in
this chapter alongside the implementation of each part. Other optimization methods
and improvement suggestions that were out of the scope of this thesis will be
presented in Chapter 6. Moreover, some performance optimization techniques are
given in Chapter 5.
This implementation applies outsourcing techniques to perform some operations
of the presented solution. A. Bingham and D. Spardlin in [79] explain how open
source can be beneficial as a tool for outsourcing; hence, making use of open source
software and hardware is one outsourcing technique adopted in this work. Although
it is crucial to perform several investigations to manage diversity and risk sharing in
the outsourcing process [80], this thesis does not cover the investigation process in
detail.
C and C++ are selected as the main programming languages. The open source API
of FFMPEG [81] is used to perform the network-based streaming operations and
transcoding computations. OpenCV [82] libraries are utilized to perform the machine
vision operations. The main hardware components are built using the Arduino [83]
open source platform. The lighting components consist of commercial LED light
bulbs (IKEA Ledare).1
4.1 Physical Layer and Devices
This section describes the electronic components and circuitry needed to implement
the infrastructure for the transmitter end. In order to be able to control the light bulbs
in a desired manner and send the information through them, a micro-controller or any
other central/distributed logical controller is needed. The high voltage/power nature of
commercial LED light bulbs requires a level of safety and isolation from the controller
side of the physical layer. This causes the circuitry for this project to be more complex
than a common micro-controller driving a small LED.
4.1.1 Electronic Components
An important feature of VLC is its ability to embed the communication within the
ambient lighting of the environment [84]. In order to exploit this feature and provide
enough luminance for this communication, high power LED light bulbs are advised to
be used. This prototype uses the IKEA Ledare E27 LEDs which can produce 400 units
of luminance. Figure 4.1 shows a sample Surface Mount Device (SMD) LED plate of
the same light bulb.
1http://www.ikea.com/se/sv/catalog/products/70266765/
Figure 4.1: SMD LED plate of IKEA Ledare E27.
Figure 4.2 shows basic circuitry for modulating a low power LED. However, normal
micro-controllers are not able to provide enough electrical power to drive our targeted
high power LED light bulbs, as these micro-controllers operate at low voltage levels.
Therefore a middle component is needed between the micro-controller and the LEDs,
one that can be pulsed by the micro-controller while also driving the LEDs.
Figure 4.2: Circuit schematic for modulating low power LED.
A common component for connecting high voltage circuits to low voltage circuits is
the relay [85]. However, the electromechanical nature of relays makes them too slow
to support high frequency pulsing; this task should therefore be done by a fast
electronic component, such as a Bipolar Junction Transistor (BJT) [86] or a
Metal-Oxide-Semiconductor Field-Effect Transistor (MOSFET) [86], with support for
high voltage and high switching speed. In case the LED package comes with a
built-in or separate electronic driver, the micro-controller can simply be placed
between the driver and the LEDs. Otherwise, all low-speed electric components must
be detached and replaced with CMOS-based equivalent circuits. The Darlington
Pair [87] is a well-known configuration of two BJT transistors which behaves like a
single transistor but with a high current gain. In our implementation high voltage
TIP120 [88] ICs are used. These ICs implement the Darlington Pair as shown in
Figure 4.3.
Figure 4.3: Darlington Pair implemented in TIP120 ICs.
4.1.2 Circuit Isolation
An optocoupler is used to isolate the low power circuits (i.e. the microcontroller)
from the high power section. An optocoupler (or opto-isolator) is an electronic
component that transfers electronic signals between two isolated circuits by means
of light [89].
The optocoupler (shown in Figure 4.4) has two main components: an LED that
converts electronic signals to light, and a phototransistor. A phototransistor [90] is a
transistor (similar to the photodetectors explained in 3.1) sensitive to light, meaning
that its base can be triggered by light rather than by an electric current. In this
implementation the CNY17 [91] ICs are used for isolation purposes.
Figure 4.4: Optocoupler.
The CNY17 package provides an isolation voltage of up to 5000 volts, its input can
be triggered by 1.4 volts (which an Arduino output pin can easily provide) and its
switching characteristics are fast enough to support the modulation speed needed in
our application.
4.1.3 Circuitry
The complete circuit of the transmitter is shown in Figure 4.5. The microcontroller
used in this prototype is an ATmega32 [92] mounted on the Arduino Uno [93] board.
The Arduino Uno is a single-board microcontroller that provides easy-to-use
prototyping functionalities.
The board has a built-in ATmega16U2 [94] microcontroller that provides serial
connectivity to the ATmega32 chip through the USB port. Moreover, the API
libraries provided in this work wrap this serial connectivity. This makes application
development more portable and allows developers to communicate with the
ATmega32 microcontroller through other types of channels as well (e.g. Wi-Fi, BT,
etc.). Examples of usage of the API are given in Appendix A.2.
Figure 4.5: Circuit schematic for the transmitter.
The role of the resistors in Figure 4.5 is to cancel any unwanted noise at the gates of
the transistors (a.k.a. current-limiting resistors [95]). Depending on the threshold
value of a transistor, a small amount of noise might otherwise trigger the gate and
switch the transistor on or off. A picture of the hardware components of the prototype
is shown in Figure 4.6.
Figure 4.6: Hardware components of the prototype.
4.2 VLC Transmitter
This section covers the definition, selection and transmission of the carrier
frequencies. Other aspects that are directly related to the selection of carrier
frequencies, such as quality of experience and bandwidth, are also surveyed. The
section starts by explaining the modulation techniques that are used (or have been
researched) for VLC systems. After selecting a modulation technique that fits the
requirements of a camera-based VLC system, the bandwidth constraints are
explained, followed by the frequency selection process. Finally, we explain how
manipulating some characteristics of the modulating frequencies can improve the
quality of experience. Tutorials on the usage of the implemented transmission APIs
are given in Appendices A.1 and A.2.
4.2.1 Modulation Techniques
Modulation is a key factor in a VLC system: the proper selection of the modulation
technique can improve the performance and accuracy of the system [96] [97].
Several modulation methods have been suggested in the literature, along with
comparisons of their advantages and disadvantages. However, not all of them are
suitable for a camera-based VLC system.
The authors of [11] categorize modulation techniques for VLC into three groups:
On-Off Keying (OOK), Pulse Position Modulation (PPM) and Color Shift Keying
(CSK). OOK, the simplest one, is based on the brightness intensity of the light: high
luminance represents a logical one and low luminance (or the off state) represents a
logical zero. The combination of OOK modulation and a coding scheme can provide
beneficial utilization of bandwidth [96]. However, the impact of ambient noise makes
OOK modulation unsuitable for non-isolated environments and camera-based VLC
systems. PPM variations encode information in the pulse position. PPM techniques
are generally more complex to implement due to the synchronization requirements
between the sender and the receiver [98]; hence, PPM variations are not suitable for
camera-based VLC systems, although they are widely used in systems with
photodetectors [99] [100]. CSK is achieved by modulating the color components of
the white light. Its limitations reside on both the sender and receiver sides: the LED
lights should support different colors and provide an interface for manipulating them,
while the receiver should be able to detect the different colors and demodulate the
transmitted information.
Spatial Modulation [101] is a newer technique which maps bit patterns onto
constellations of light sources detectable by the receiver. Since the transmitter and
receiver have to remain in fixed positions without movement, this technique is not
applicable for our purpose.
Because of the characteristics of the rolling shutter (explained in Chapter 3.1.1) and
of LEDs, Frequency Shift Keying (FSK) is the best candidate for modulation in our
implementation [102] [73]. The OOK, PPM and FSK modulation techniques are
shown in Figure 4.7.
Figure 4.7: Modulation techniques
In FSK, different frequencies are used to represent different logic levels and symbols.
A blinking light has a frequency corresponding to the intervals between its "on" and
"off" states, and various sets of frequencies are used to represent different bit
patterns. On the receiver side, a Fourier Transform of the received image helps
determine the magnitude of the frequencies present in the frequency spectrum. More
details on the Fourier Transform are given in Section 4.3.3.
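The single-bin detection idea can be sketched with the Goertzel algorithm, a cheap way to estimate the magnitude of one frequency bin; this is a simplified, illustrative stand-in for the full Fourier Transform described in Section 4.3.3, and the function name is an assumption:

```cpp
#include <cmath>
#include <vector>

// Goertzel algorithm: magnitude of a single frequency bin. For a
// rolling-shutter receiver, the samples can be the mean intensities of
// successive pixel rows, and the sample rate is the row-readout rate
// (rows per second).
double goertzelMagnitude(const std::vector<double>& samples,
                         double freqHz, double sampleRate) {
    const double pi = std::acos(-1.0);
    const double coeff = 2.0 * std::cos(2.0 * pi * freqHz / sampleRate);
    double s1 = 0.0, s2 = 0.0;
    for (double x : samples) {
        const double s0 = x + coeff * s1 - s2;
        s2 = s1;
        s1 = s0;
    }
    return std::sqrt(s1 * s1 + s2 * s2 - coeff * s1 * s2);
}
```

Scanning the candidate carriers with such a detector and thresholding the returned magnitudes yields the received symbol; carriers whose magnitude stays below the threshold are treated as noise.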
4.2.2 Bandwidth Limitations
This section explains the practical restrictions that limit the bandwidth of the
transmission frequencies. These restrictions fall into two groups: limitations imposed
by the characteristics of the human eye, and limitations caused by video compression
techniques. Both categories are explained in the following.
The second category of restrictions originates at the receiver side. However, these
limitations impose design regulations that have to be dealt with at the sender side.
Therefore, in order to explain the design regulations related to the transmitter, some
of the results acquired at the receiver are presented in this section. The receiver side
is explained in detail in Section 4.3.
Restrictions Caused by Human Eye
A light source which blinks faster than a certain rate appears constantly on to the
human eye. This allows VLC to be embedded in the existing ambient light of the
environment. However, a blinking light source with a frequency lower than a certain
threshold is detectable by the human eye [103] [104] [105] [106], which can cause
negative or harmful physiological changes in humans [11].
In order to avoid flickering, low frequencies should not be used as a carrier for
information transmission; this limits the lower bound of the transmission bandwidth.
Although there is no consensus on what this limit should be [107] [108], recent
studies [109] suggest that any frequency above 200 Hz is safe. However, in our
experiments frequencies below 1 kHz were easily perceptible. Hence, 1 kHz is set as
the lowest carrier frequency used in our system.
Restrictions Caused by Video Compression
The bandwidth is further limited when the receiver sits in the post-compression
stages of the video stream. This side effect arises because many common video
codecs (e.g. H.264) use block-based compression [110], applying a low-pass filter
that blurs the portions of the picture containing high frequencies.
Although the sample rate of the camera (one row of pixels at a time) defines the
highest possible frequency (i.e. 10.8 kHz in our settings), narrowing the period of the
flashing lights down to two rows of pixels (one row for the dark band and one row for
the bright band) does not provide accurate results: at very high frequencies the
camera captures gray pixels instead of precise dark and bright ones. Equation 4.1
shows how the maximum frequency for a horizontal rolling shutter can be calculated,
where R is the number of rows in one video frame and Fps is the frame rate of the
video stream. The division by 2 reflects that one period spans two rows: one for the
bright band and one for the dark band. For example, in a 30 fps video with 720p
resolution the maximum blinking frequency for the LEDs is 10.8 kHz.

Fmax = (Fps × R) / 2    (4.1)
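Equation 4.1 can be checked numerically (a trivial sketch):

```cpp
// Equation 4.1: the rolling shutter reads Fps * R rows per second, and
// one carrier period needs at least two rows (one bright, one dark),
// so the highest capturable blinking frequency is Fps * R / 2.
double maxCarrierHz(double fps, int rows) {
    return fps * rows / 2.0;
}
```

For a 30 fps, 720-row stream this evaluates to 10 800 Hz, matching the 10.8 kHz figure above.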
However, in our experiments the highest on-off keying frequency could not exceed
3.2 kHz due to the side effects of video compression. Figure 4.8 shows this effect in
practice. As can be seen in Figure 4.8a, in the portion of the picture containing the
high frequency data the dark and bright bands are merged, causing a blurry result
which cannot be detected in the Fourier Transform of the picture; the Fourier
Transform of the same picture is shown in Figure 4.8b. On the contrary, the picture
shown in Figure 3.4 depicts a clear shot of the carrier frequency (1.8 kHz in that
example) where the dark and bright bands are distinguishable. The FT of the video
frame shown in Figure 3.4 is presented in Figure 4.9, where the presence of the
carrier frequency appears as bright dots in the corresponding frequency bins.
(a) Blured frame carrying a 3.5kHz signal
(b) Fourier Transform of the frame shown in (a)
Figure 4.8: Block-based video compression effect
Figure 4.9: FT of the frame shown in Figure 3.4
Moreover, the inter-frame motion estimation mechanism used in video encoders
[110] causes another side effect that can affect the accuracy of the receiver in
detecting the bit patterns. Our experiments reveal that motion estimation can
introduce parasite frequencies in non-reference frames (i.e. P and B frames),
lowering the SNR. The red rectangle in Figure 4.10 marks a portion of the frame in
which the previous frame (which was also a reference frame) was carrying a
frequency. A comparison between this portion and the rest of the frame shows the
after-effect of the block-based motion estimation. The Fourier transform of the frame
in Figure 4.10a is shown in Figure 4.10b, where the red circles mark the notch effect
caused by the block-based motion estimation. These notched points might confuse
the receiver by indicating the existence of a frequency.
(a) Non-reference frame coming after a high frequency carrying frame
(b) Fourier Transform of the frame shown in (a)
Figure 4.10: Block-based motion estimation effect
4.2.3 Frequency Selection
As explained in Section 4.2.2, the lower and upper bounds of the bandwidth are
restricted by many factors. In this prototype the carrier frequencies can be selected
between 1 kHz and 3.2 kHz. In spite of that, a few other factors have to be taken into
account when choosing the carrier frequencies.
One key point in this process is considering guard bands. Guard bands are gaps
placed between the adjacent frequencies in order to prevent Inter-Symbol Interference
(ISI) [111]. This concept is shown in Figure 4.11. Consequently, the presence of guard
bands will consume more bandwidth [112].
Figure 4.11: Guard intervals.
The authors of [75] suggest a 200 Hz guard band between adjacent frequencies for a
VLC system based on mobile cameras. Given the limitations mentioned earlier, there
are theoretically ten separate frequencies that can be used in this implementation.
But, as explained in the following, in practice the transmission is vulnerable to
further limiting factors that have to be taken into account.
Another key point in frequency selection is the effect of harmonic frequencies. For a
fundamental frequency f, the harmonic frequencies are 2f, 3f and so on [113]. For
example, a fundamental frequency of 1 kHz (the first harmonic being itself) has a
second harmonic of 2 kHz, a third harmonic of 3 kHz and so on. Generating a high
frequency square wave such as 1 kHz naturally also generates its odd harmonics (in
this example 3 kHz, 5 kHz, etc.), but with lower energy [114]. Figure 4.12 shows this
phenomenon in the time domain and the frequency domain.
Figure 4.12: Odd harmonic frequencies in a square wave.
One should be careful when selecting the carrier frequencies because, for example, a
transmitted 1 kHz square wave also carries energy at 3 kHz, so the receiver might
mistakenly detect both 1 kHz and 3 kHz. Proper threshold definition on the
magnitude of the received signals can solve this problem to some extent; however,
the problem gets much more complicated when a combination of different
frequencies is used to represent one symbol. This phenomenon is known as harmonic
distortion and is measured by the Total Harmonic Distortion (THD) [115].
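A selection aid can be sketched as a check that no carrier sits near an odd harmonic of another. This is illustrative code, not part of the thesis; `hasOddHarmonicClash` and the tolerance value are assumptions:

```cpp
#include <cmath>
#include <vector>

// Flag carrier pairs where one carrier falls near an odd harmonic
// (3f, 5f, 7f) of another, within a tolerance that loosely models the
// guard band / FFT bin width.
bool hasOddHarmonicClash(const std::vector<double>& carriersHz,
                         double toleranceHz) {
    for (double f : carriersHz)
        for (int k = 3; k <= 7; k += 2)      // 3rd, 5th, 7th harmonics
            for (double g : carriersHz)
                if (std::fabs(g - k * f) < toleranceHz)
                    return true;
    return false;
}
```

Under this check, 1 kHz and 3 kHz clash (3 × 1 kHz = 3 kHz), whereas the third harmonic of 1.077 kHz (3.231 kHz) clears the 3.02 kHz carrier by about 211 Hz.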
Eventually the number of applicable frequencies is reduced to five. The set of
defined frequencies for this implementation is given in Table 4.1.
Index Frequency Period/2 Next guard interval
1 1.077kHz 464µs 440Hz
2 1.51kHz 330µs 480Hz
3 1.99kHz 251µs 410Hz
4 2.4kHz 208µs 620Hz
5 3.02kHz 165µs ∞
Table 4.1: Defined frequencies
4.2.4 Lifetime of Frequencies
The lifetime of a frequency refers to the time period during which the light sources
blink at that frequency. The frequency lifetime has an impact on the accuracy and
robustness of the system, so proper selection of this value is necessary.
A long frequency lifetime yields a higher magnitude per frame at the receiver side,
because of the increased SNR in the communication channel. If the received carrier
frequencies are distinguishable with a high magnitude contrast compared to the other
(noisy) frequencies, the thresholding mechanism becomes easier and more robust,
which results in more accurate output. Thresholding is explained in more detail in
Section 4.3.2.
In this work, the minimum frequency lifetime is set to twice the minimum value
detectable by the receiver: if the receiver is able to detect frequencies with a
minimum lifetime of λ, then the actual time dedicated to a frequency is set to 2λ. The
reason is that the computational unit of the implemented system is one video frame,
and a transmitted frequency may not fit completely into one frame. In the worst case,
half of the lifetime of the frequency is captured in the nth frame and the other half in
the (n+1)th frame; with a 2λ lifetime the receiver can still detect the transmitted
frequency within one frame. This scenario is shown in Figure 4.13. A solution to this
problem is discussed in Chapter 6.
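The rule can be stated compactly (an illustrative sketch of the reasoning above; the function names are assumptions):

```cpp
// If the receiver needs a tone to last at least minDetectableMs inside
// a single frame, transmit it for twice that long.
double transmitLifetimeMs(double minDetectableMs) {
    return 2.0 * minDetectableMs;
}

// Worst case: the tone straddles a frame boundary and its lifetime is
// split evenly between frames n and n+1, so the portion landing in one
// frame is at least half of the transmitted lifetime.
double worstCaseInFrameMs(double lifetimeMs) {
    return lifetimeMs / 2.0;
}
```

Even in the worst-case split, a tone transmitted for 2λ leaves at least λ inside one of the two frames, which is exactly the receiver's detection limit.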
Figure 4.13: Breaking of a frequency in two frames.
In our experiments, the minimum lifetime of a frequency could be narrowed down
to 4.125 milliseconds (i.e. one eighth of a frame lasting 33 milliseconds). However,
because the number of available frequencies is limited to six, the 2λ value is set to
5.5 milliseconds (i.e. one sixth of a frame lasting 33 milliseconds). Filling this gap by
lengthening the lifetime of the frequencies increases the energy of the signals and
leads to a higher SNR.
4.2.5 Flicker Improvement
One challenge in embedding VLC in surrounding ambient light is to avoid flicker and
change in illumination. The flicker in this case does not only mean the speed of flashing
37
light (explained in Section 4.2.2) but also a short change in illumination can also be
perceived as a flicker.
The reason is that although the human eye cannot detect the on-off states of a fast
blinking light source, the blinking is perceived as a constant beam of light with a
luminance equal to the average luminance over the "on" and "off" states of the light
source. This means that if a blinking light source is on for 50% of the time at its
highest luminance capacity Lmax, then the perceived constant luminance will be
Lmax/2. This relationship, given in Equation 4.2 [116], underlies Pulse Width
Modulation (PWM), which has been in use for dimming purposes [11] where altering
the current is not possible [117]:

Lv = Lmax × (τon / T)    (4.2)
In Equation 4.2, T is the PWM period, Lmax is the maximum luminance (100% duty
cycle), τon is the pulse duration during which the LED is on, and Lv is the resulting
perceived luminance. Obviously, using frequency modulation techniques for VLC
will cause a change in the level of perceived luminance, and if the communication
period is short this change in luminosity is perceived as a flicker.
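Equation 4.2 in code form (a trivial sketch): a 50% duty cycle yields half of the full-on luminance, matching the Lmax/2 example above.

```cpp
// Equation 4.2: the perceived luminance of a PWM-driven LED equals
// the full-on luminance scaled by the duty cycle tau_on / T.
double perceivedLuminance(double lmax, double tauOnUs, double periodUs) {
    return lmax * tauOnUs / periodUs;
}
```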
A lot of research has been done to solve the flicker problem and to provide dimming
support for VLC-based LED light sources. The authors of [118] suggest a modulation
method called Pulse Dual Slope Modulation (PDSM) for flicker improvement in
VLC. PDSM is a variation of Pulse Slope Modulation (PSM), which modulates the
signal by changing the slope of the leading edge of the pulse; however, PSM
techniques do not seem appropriate for camera-based VLC systems. Overlapping
Pulse Position Modulation (OPPM) is suggested in [97] as the best candidate for joint
dimming control and high data-rate communication, but the PPM modulation family
requires the sender and receiver to be highly synchronized.
One way to reduce flicker when using FSK as the modulation method is to modify
the duty cycle of the modulating signal without changing its frequency [119]. This
means that when a shift in the carrier frequency happens, the duty cycle of the new
frequency is adjusted to compensate for the brightness alteration caused by the
frequency change. Figure 4.14 illustrates this method. Note that L in Figure 4.14
refers to the maximum luminance that the LEDs can produce at a 100% duty cycle.
[Figure: three signals compared — frequency 3F/2 at 50% duty-cycle yields luminance L/3; frequency 3F/2 at 62.5% duty-cycle yields luminance L/2; frequency F at 50% duty-cycle yields luminance L/2.]
Figure 4.14: Flicker compensation using duty-cycle modification.
Figure 4.14 shows that increasing the frequency reduces the perceived brightness,
and that increasing the duty-cycle by 12.5% compensates for the brightness loss.
However, the limitations of this method and its effect on camera-based sensors
need further study. For example, the author of [119] reports that a duty-cycle
higher than 75% can lead to erroneous sampling output.
Another method to reduce flicker is to add a DC bias to the transmitted signal
[120]: logic zero is then represented not by the off state of the LEDs but by a
state at a reduced distance from the on state. However, this method can
significantly reduce the SNR and accuracy.
4.3 VLC Receiver
This section explains the design and implementation of the receiver. It starts by
giving an abstract overview of the camera-based VLC receiver. The video synchronizer
software (explained in Chapter 3) is developed on top of this platform as a use case.
The machine vision techniques used in this implementation are also explained in
this section.
4.3.1 Architectural Overview
As explained in Chapter 3 of this thesis, the receiver side of the system consists of two
parts. The first part is the smartphone camera, which captures videos and streams them
for further processing. The second part receives the live streams from the cameras,
handles the machine vision tasks and performs the targeted application (i.e. video
synchronization in this case). This part resides on the server side of the system.
The flowchart in Figure 4.15 illustrates the procedure implemented in the receiving
server. The procedure starts by initializing the network configuration and
registering the necessary media codecs in the system. For scalability, the system
creates an individual thread for each incoming video stream. The incoming video
stream is buffered in memory until a decodable frame is ready. After the video
frame is decoded, it is converted to YUV format, as the image processor only needs
the Y plane (luminance) of the picture. The buffering, decoding and conversions
are implemented using the FFMPEG libraries. Meanwhile, during buffering, the same
video stream is restreamed and broadcast outside of the system. If the quality
enhancement techniques (explained in Chapter 5) are to be used, the restreaming
should be performed by re-encoding the previously decoded frames.
[Flowchart: Start → Initialization → Define receiving ends → Thread generator → (OnReceive) Video buffer → Video decoder → Picture conversion → Image processor → Message handler → Broadcast handler → End.]
Figure 4.15: Receiving server.
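The thread-per-stream structure can be sketched as follows. This is a minimal illustration, not the actual server code: handle_stream and serve are hypothetical names, and the placeholder append stands in for the FFMPEG decoding and OpenCV processing work:

```python
import queue
import threading

def handle_stream(stream_id, frame_queue, results):
    # One worker per incoming stream: in the real system this loop buffers
    # packets, decodes frames and hands the Y plane to the image processor.
    # list.append is atomic in CPython, so the shared list needs no lock here.
    while True:
        frame = frame_queue.get()
        if frame is None:              # sentinel: the stream has ended
            break
        results.append((stream_id, frame))

def serve(streams):
    # 'streams' maps a stream id to the frames received on that stream.
    results, threads = [], []
    for stream_id, frames in streams.items():
        q = queue.Queue()
        t = threading.Thread(target=handle_stream, args=(stream_id, q, results))
        t.start()
        threads.append(t)
        for frame in frames:
            q.put(frame)
        q.put(None)
    for t in threads:
        t.join()
    return results
```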
The Y plane of the picture is then converted to a matrix compatible with OpenCV's
Mat data type. Using the OpenCV API, the matrix is then converted to its frequency
domain by a Discrete Fourier Transform (DFT) algorithm. These steps are shown in
Figure 4.16. Once the frequency domain is obtained, the post-DFT computations take
place and the system looks for the magnitude of the desired frequencies using a
dynamic thresholding algorithm explained in Section 4.3.2. The post-DFT
computations include the centralization process (explained in Section 4.3.3), the
separation of the magnitude and phase planes (the result of the DFT function is a
complex number representing the magnitude and phase components of the signal), and
a logarithmic scaling of the magnitude values to the range between 0 and 1.
[Figure: network packets enter the video buffer; each frame is decoded, converted to YUV, its Y plane separated and converted to a Mat (time domain); a DFT converts it to the frequency domain, followed by the post-DFT computations and message decoding.]
Figure 4.16: Picture conversion and image processing.
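The implementation uses OpenCV's DFT; the same sequence of post-DFT steps can be approximated with NumPy as a sketch (the function name is ours):

```python
import numpy as np

def dft_magnitude(y_plane):
    # DFT of the Y plane, followed by the post-DFT steps: centralization,
    # separation of the magnitude from the phase, and logarithmic scaling
    # of the magnitudes to the range [0, 1].
    spectrum = np.fft.fft2(y_plane.astype(np.float64))
    spectrum = np.fft.fftshift(spectrum)        # centralization
    magnitude = np.abs(spectrum)                # drop the phase plane
    magnitude = np.log1p(magnitude)             # logarithmic scaling
    return magnitude / magnitude.max()          # normalize to [0, 1]
```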
The detected frequencies are then handed over to the message handler, where the
message decoding takes place. The messages are then passed to the broadcast
handler, where the decisions, based on the received messages and the application,
are made. In the case explained in Section 3.2 (i.e. video synchronization), the
broadcast handler decides about delaying the necessary streams in order to provide
synchronization, and the appropriate delay is calculated at this stage if
necessary. Because FSK is selected as the modulation technique, the message
decoder builds a set of detected frequency elements and compares it to a
pre-defined look-up table in order to decode the transmitted symbols. In the
example shown in Figure 4.19, the detected frequencies are f1, f2 and f3, which
can (depending on the protocol) be translated to 111 in binary or 7 in decimal. An
example of the lookup table used in this implementation is given in Appendix A.3
of this thesis.
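The look-up based decoding can be sketched as follows, with a hypothetical three-frequency table (the real table is given in Appendix A.3):

```python
# Hypothetical frequency-to-bit mapping; the actual lookup table used in
# the implementation is listed in Appendix A.3 of the thesis.
FREQ_BITS = {"f1": 0b001, "f2": 0b010, "f3": 0b100}

def decode_symbol(detected_frequencies):
    # OR together the bit values of all carrier frequencies detected in
    # one frame to recover the transmitted symbol.
    symbol = 0
    for freq in detected_frequencies:
        symbol |= FREQ_BITS[freq]
    return symbol

print(decode_symbol({"f1", "f2", "f3"}))  # 7 (111 in binary)
```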
4.3.2 Thresholding
In order to determine whether a picture contains a certain frequency, it is
necessary to evaluate the magnitude (a.k.a. energy or power) of the signal
representing that frequency. This evaluation can be done by comparing the
magnitude value to a pre-defined threshold. As explained previously in this
chapter, static thresholding does not result in accurate measurements; therefore,
a dynamic thresholding mechanism is used in this work.
Dynamic thresholding here means that instead of following a static approach for
every frame, the threshold value is determined per frame by considering the
maximum range values, the minimum range values and the quantity of these values in
each frame. This method (shown in Figure 4.17) was selected because transmitting
the same symbols in different takes (e.g. in different environments) does not
always produce exactly the same values in the frequency domain, even though the
recorded takes might look very similar to the human eye. The transmitted
frequencies are not immune to the noise imposed by the ambient background light
and other elements in the scene, and the non-deterministic nature of the captured
videos contributes to this phenomenon significantly. This way of thresholding also
helps suppress the effect of harmonic frequencies, as explained in Section 4.2.3.
The process starts by excluding a set of maximum and minimum values from the main
set of frequencies. The median value of the remaining set is then used as an
anchor for setting the threshold value. Calculating the mean value at this stage
is not advisable because the large quantity of small, noisy signals might pull the
anchor towards the smaller signals. In this implementation, the threshold is set
20% above the median value.
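The per-frame thresholding steps above can be sketched as follows (the function name and the trim count of two are illustrative choices; the thesis does not fix how many extremes are excluded):

```python
import statistics

def dynamic_threshold(magnitudes, trim=2, margin=0.20):
    # Exclude the 'trim' largest and smallest magnitudes, anchor on the
    # median of the rest, and place the threshold 20% above the anchor.
    core = sorted(magnitudes)[trim:len(magnitudes) - trim]
    return statistics.median(core) * (1.0 + margin)

print(dynamic_threshold([0, 0, 1, 2, 3, 10, 100]))  # 2.4
```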
[Figure: magnitude spectrum with the maximum and minimum values excluded; the threshold value is placed 20% above the median of the remaining values.]
Figure 4.17: Finding the threshold value.
However, extra care should be taken when evaluating the maximum values, because
the DC component of the frequency domain might mistakenly be considered a maximum
value, resulting in a poor choice of threshold. We solve this problem by applying
a notch filter to the DC bins of the matrix obtained after centralizing the DFT.
These steps are explained in the following section. The high-valued DC components
are caused by sharp edges and borders in the picture.
4.3.3 Discrete Fourier Transform
The picture shown in Figure 4.18 represents a video frame carrying three different
signals. The key phenomenon in camera-based VLC applications is that the width of
the dark and bright bands in the picture is proportional to the blinking frequency
of the LED light sources. If the system can detect how often a dark/bright band is
repeated in the picture, it can roughly estimate the blinking frequency of the
transmitter.
[Figure: a frame with three band regions labelled f1, f2 and f3.]
Figure 4.18: A frame carrying three frequencies.
To provide this ability, an implementation of the Fourier Transform algorithm is
used. A Fourier Transform function, as shown in Figure 4.19, takes a signal in the
time domain and returns its frequency-domain representation. Equation 4.3 defines
a two-dimensional DFT [121]. In Equation 4.3, f is the signal in the time domain
and m = 0 . . .M − 1 and n = 0 . . . N − 1 are the coordinates. The resulting
Fourier transform F is also a two-dimensional matrix of the same size.
F(m, n) = (1/√(MN)) Σ_{u=0}^{M−1} Σ_{v=0}^{N−1} f(u, v) · e^{−2πi(mu/M + nv/N)}    (4.3)
There are several different algorithms for and implementations of the Fourier
Transform. In this work, the DFT function in OpenCV's API is used to obtain the
desired result.
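The band/frequency relationship described above can be demonstrated on a synthetic frame; a NumPy sketch (the function name and the 8-cycle test pattern are illustrative):

```python
import numpy as np

def detect_band_frequency(frame):
    # With a vertical rolling shutter the bands are horizontal, so the
    # energy sits in the zero-horizontal-frequency column of the 2D
    # spectrum; return the strongest non-DC vertical bin.
    spectrum = np.abs(np.fft.fft2(frame))
    column = spectrum[:, 0].copy()
    column[0] = 0.0                      # suppress the DC bin
    return int(np.argmax(column[:len(column) // 2 + 1]))

# Synthetic 128x64 frame whose brightness oscillates 8 times from top to
# bottom, mimicking the bands produced by a blinking LED.
rows = np.sin(2 * np.pi * 8 * np.arange(128) / 128)
frame = np.tile(rows[:, None], (1, 64))
print(detect_band_frequency(frame))  # 8
```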
[Figure: a time-domain signal containing f1, f2 and f3 is converted by the DFT to a frequency-domain magnitude plot in which f1, f2 and f3 appear as peaks above the threshold.]
Figure 4.19: Time to frequency conversion.
As explained earlier, in order to filter the DC components with a notch filter, it
is more convenient to centralize the resulting spectrum matrix [122]. This process
starts by dividing the spectrum matrix into four quadrants and ends by swapping
these quadrants diagonally. The centralization process is demonstrated in
Figure 4.20. In this way the DC bias values gather in the center of the matrix and
the transmitted frequencies line up in the middle column.
[Figure: the spectrum matrix quadrants Q1 Q2 / Q3 Q4 are swapped diagonally into Q4 Q3 / Q2 Q1, moving the DC values from the corners to the center.]
Figure 4.20: Spectrum centralization.
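The quadrant swap can be sketched directly (the function name is ours; for even-sized matrices the result matches NumPy's fftshift):

```python
import numpy as np

def centralize(spectrum):
    # Swap the quadrants diagonally (Q1<->Q4, Q2<->Q3) so the DC values in
    # the corners gather in the center of the matrix.
    h, w = spectrum.shape[0] // 2, spectrum.shape[1] // 2
    out = np.empty_like(spectrum)
    out[:h, :w] = spectrum[h:, w:]   # Q4 -> top-left
    out[:h, w:] = spectrum[h:, :w]   # Q3 -> top-right
    out[h:, :w] = spectrum[:h, w:]   # Q2 -> bottom-left
    out[h:, w:] = spectrum[:h, :w]   # Q1 -> bottom-right
    return out
```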
Digital pictures are treated as two-dimensional matrices, and it is therefore
common in machine vision to perform a two-dimensional DFT on pictures as well.
However, the authors of [73] argue that performing a one-dimensional vertical
Fourier Transform is sufficient for VLC-based purposes, because turning the camera
will not affect the rolling shutter effect and the captured dark/bright bands in
the picture will always remain horizontal.
This argument is supported by a comparison of the computational and memory
complexity of the two methods. The computational complexity of a two-dimensional
DFT (assuming width = height) is O(N^3), which can be reduced to O(N^2 log2 N) if
the Fast Fourier Transform (FFT) is used [123], while the complexity of a
one-dimensional DFT is O(N^2), or O(N log2 N) with FFT [124].
Nevertheless, the authors of [73] neither suggest any technique for performing the
one-dimensional Fourier transform on a two-dimensional matrix nor study the
trade-off between the 2D/1D Fourier transform and accuracy (i.e. the SNR value).
Converting the two-dimensional matrix into a one-dimensional vector (array) could
be one way to exploit the efficiency of the one-dimensional Fourier transform, yet
how this conversion should be performed remains a challenge. One way to convert
the two-dimensional matrix into a vertical array is to compute the average value
of each row (or the average of samples from each row) and store that average as
the vector element representing the corresponding row. However, background objects
might cause bright bands to average out to gray pixels and decrease the SNR.
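The row-averaging conversion sketched above could look as follows; this is our own illustration of one possible realization, not a method evaluated in the thesis:

```python
import numpy as np

def vertical_spectrum(frame):
    # Collapse each row to its average value, turning the 2D frame into a
    # vertical 1D profile, then take a 1D FFT of the profile with its DC
    # level removed.
    profile = frame.mean(axis=1)
    return np.abs(np.fft.rfft(profile - profile.mean()))
```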
OpenCV's DFT function selects an FFT implementation when the matrix size makes FFT
faster than a plain DFT [125]. This condition is satisfied when the width and
height of the matrix are powers of two (i.e. 2, 4, 8, etc.). To take advantage of
this property, the matrix is padded (if necessary) with extra columns/rows of
zero-valued elements (i.e. black pixels). This process is called zero padding
[126] [127]. Figure 4.21 demonstrates the two-dimensional Fourier transform of the
picture shown in Figure 4.18 using OpenCV's DFT function.
Figure 4.21: DFT conversion of the picture shown in Figure 4.18.
The brightness of each pixel in Figure 4.21 represents the magnitude of the
frequency in the corresponding bin. If the camera has a vertical rolling shutter,
the bright dots will be located on the middle vertical column of the frequency
spectrum.
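The zero-padding step described above can be sketched as follows (the function name is ours; OpenCV's own helper for choosing an efficient size is getOptimalDFTSize):

```python
import numpy as np

def pad_to_power_of_two(frame):
    # Append zero-valued (black) rows/columns so both dimensions become
    # powers of two, which lets the DFT fall back to an FFT.
    def next_pow2(n):
        p = 1
        while p < n:
            p *= 2
        return p
    h, w = frame.shape
    padded = np.zeros((next_pow2(h), next_pow2(w)), dtype=frame.dtype)
    padded[:h, :w] = frame
    return padded

print(pad_to_power_of_two(np.ones((720, 1280))).shape)  # (1024, 2048)
```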
4.4 Summary
In this chapter we explained the details of the design and implementation of the
camera-based VLC system. The chapter started by describing the hardware components
needed to form the physical layer of the communication. We then explained the
transmission concepts and issues, such as modulation techniques, bandwidth
constraints and the process of carrier frequency selection. Finally, the design
and implementation details of the camera-based VLC receiver were presented. The
implemented receiver mainly consists of video stream handling and image processing
modules. In the next chapter we evaluate the implemented system by performing
actual tests and benchmarking.
5 RESULTS AND EVALUATION
Performance and quality are key factors in evaluating a real-time system. Midway
computations in a live multimedia service can lead to intolerable latency and/or
distorted quality. For example, in a video streaming service it is necessary that
the video can be played back at the desired frame rate; therefore the blocking
computations that can lead to an unacceptable experience need to be optimized or
eliminated.
This chapter evaluates the implemented prototype in terms of performance and
quality by analysing the behaviour of the implemented software. A few relevant
suggestions and techniques for further improvement are also given in this chapter.
More details on future work possibilities are discussed in Chapter 6.
5.1 Evaluation Setup
In order to provide an isolated test environment for the implemented VLC system,
it is necessary to feed the synchronization servers with video streams that
actually contain unaligned identical frames. Figure 5.1 illustrates this scenario.
To introduce an artificial delay between the video streams, three seconds of blank
frames are added at the beginning of one of the pre-captured test videos.
Moreover, a timestamp counter is burned into the video frames in order to give the
tester (i.e. the human eye) a notion of time and of the synchronization events.
[Figure: two video streams with numbered frames enter the video synchronizer; on unsynchronized playback their frame counters are offset, while after the synchronization point the observer sees both streams aligned.]
Figure 5.1: Test scenario for the video synchronizer
In order to simulate a live video streaming environment for the synchronizer, the
pre-recorded videos are streamed over the network using the FFMPEG software. In
this way, the synchronizer can treat the video streams as if they were live.
Figure 5.2 shows a screenshot of an example where two incoming video streams are
played back in an unsynchronized manner. The resolution of the videos in this
experiment is 720p (i.e. 1280x720 pixels) and the frame rate is 30fps.
The example shown in Figure 5.2 indicates that a delay of approximately 3 seconds
exists between the two streams. Figure 5.3 shows screenshots of the same video
streams after being synchronized. The console log of the synchronizer server
indicates the detection of checkpoint information at around the 6th second of the
first video stream and the 9th second of the second video stream. Hence, the
synchronizer can calculate the exact delay between the two video streams (i.e.
around 3.5 seconds) and impose the necessary amount of delay on the earlier video
stream.
Figure 5.2: Unsynchronized video streams on playback
5.2 Evaluation Benchmarks
In this section the performance analysis results are presented. Optimization and
acceleration techniques are discussed in the next section. The characteristics of
the testing platform are summarized in Table 5.1. The CUDA programming model is
exploited to harness the manycore features of the GPU.
Processing Unit   Number of Cores   Speed     Memory   Vendor
CPU               4                 2.6GHz    3.7GB    Intel
GPU               96                1.25GHz   1GB      nVidia
Table 5.1: Platform characteristics
The performance profiling results of the Intel platform are illustrated as a pie chart
[Screenshots: video stream 1, video stream 2 and the synchronizer console log.]
Figure 5.3: Synchronized video streams on playback
in Figure 5.4. It can be seen that OpenCV's DFT function is the most
time-consuming, followed by the decoding and lookup operations.
The DFT operation takes first place, consuming more than 53% of the application's
execution time. The video decoding operations, stream handling and codec
conversions (all performed by the FFMPEG libraries) are categorized into one
group, taking 26.4% of the total execution time. Element lookup in Figure 5.4
refers to the element-by-element tracing of the magnitude matrix that results from
the DFT operation. This operation, which in total is twelve times shorter than the
DFT operation, ranks third. Any other operation that takes less than 4.5% of the
whole application's time is categorized under the Other group. This group mainly
consists of other OpenCV-related operations (such as logarithm operations and type
casts), thread handling and other small calculations. The conversion of the video
frames into OpenCV's matrix data structure also belongs to this category.
[Pie chart: DFT 53.7%, decoding and codec conversions 26.4%, Other 15.4%, element lookup 4.5%.]
Figure 5.4: Performance profiling results
5.3 GPU Acceleration
Task consolidation is a key method for optimization and resource utilization in
cloud environments [128]. One way to achieve higher performance is to transfer
data-parallel tasks to co-processor units such as the GPU [129]. The high
computational requirements of DFT/FFT operations and the advantages of GPGPU in
performing data-parallel tasks provide enough motivation to migrate the DFT
operations to GPU units [130] [131].
Table 5.2 shows the benchmarking results on the different processing units. The
µs/f (microseconds per frame) unit in Table 5.2 indicates the time each task (or
set of tasks) takes for one video frame. It can be seen that migrating the DFT
computations to the GPU speeds up the performance by 62%. This comparison includes
the time each frame takes to be uploaded into GPU memory and the time it takes to
download the results back to CPU memory.
Moreover, the data transfer between the memory spaces can be reduced by 50% if the
necessary calculations on the DFT results are also done by the GPU, as there is
then no need to download the huge matrices back to CPU memory. At the time of
writing this thesis we were unable to gain direct access to individual matrix
elements in GPU memory space, so the matrices had to be downloaded back to CPU
memory for further analysis.
Method   GPU Computation Time   CPU Computation Time   GPU Upload + Download Time   Gain
DFT      5927 µs/f              24940 µs/f             3536 µs/f                    62.05%
Table 5.2: Results of GPU acceleration
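The gain figure in Table 5.2 can be reproduced from the reported timings:

```python
# Timings from Table 5.2, in microseconds per frame.
cpu_time = 24940.0       # DFT on the CPU
gpu_time = 5927.0        # DFT on the GPU
transfer_time = 3536.0   # upload to + download from GPU memory

gain = 1.0 - (gpu_time + transfer_time) / cpu_time
print(round(gain * 100, 2))  # 62.06 (the table reports 62.05, a truncation)
```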
Another advantage of GPU acceleration is the ability to gain further performance
through pipelined computing. The details of the pipelining methods are presented
in Section 6.2.
5.4 Quality
It is necessary to measure the impact of the distortion caused by the transmitted
VLC data on the recorded media. Although the data transmission is not perceptible
to the human eye in the physical environment [132], it might become noticeable on
playback of the recorded video. On the other hand, the accuracy of the system on
the receiver side can and should be increased by channel coding techniques, such
as Forward Error Correction (FEC) [133], at the physical layer. These techniques
usually require allocating extra bits [134], which consumes additional bandwidth.
Although there is a trade-off between system accuracy and the quality of the
recorded video content, if the VLC transmission period is shorter than a certain
amount of time and the transmission intervals are long enough, the quality of the
captured video will not be degraded by any noticeable distortion. The impact of
the distortion can be measured by the proposed formula given in Equation (5.1):
QoS = (t × α) / (Fps × τ)    (5.1)
wherein t is the lifetime of a video frame in milliseconds, α is the portion of a
video frame that contains a carrier frequency (a value between 0 and 1), Fps is
the frame rate of the video in frames per second (fps) and τ is the time interval
between two consecutive frames containing VLC signals in milliseconds. These
relations are illustrated in Figure 5.5.
[Figure: a sequence of frames N to M+1 in which only one frame carries a VLC signal; α·t marks the carrier portion within that frame and τ the interval until the next VLC-carrying frame.]
Figure 5.5: Sequence of frames with maximum one VLC frame
Depending on the defined bit rate of the transmission, the lifetime of a carrier
frequency can be as short as a small portion of the total lifetime of a frame,
e.g. one sixth of a frame (α). The lifetime of a video frame (t) on a common 30fps
(Fps) recording device is about 33 milliseconds. This makes the lifetime of our
example carrier frequency roughly 5.5 milliseconds. By setting the VLC
transmission interval (τ) to e.g. 5 seconds, a video stream with a playtime of 5
seconds may contain only 5.5 milliseconds' worth of VLC data. This portion is too
small to be noticed in live playback and does not constitute more than 0.0036% of
the whole playtime.
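The worked example above can be checked against Equation 5.1 directly (the function name is ours):

```python
def vlc_distortion_ratio(t, alpha, fps, tau):
    # Equation 5.1: QoS = (t * alpha) / (Fps * tau).
    return (t * alpha) / (fps * tau)

# Worked example from the text: 33 ms frames, carrier covering 1/6 of a
# frame, 30 fps video, one VLC-carrying frame every 5000 ms.
qos = vlc_distortion_ratio(33.0, 1.0 / 6.0, 30, 5000.0)
print(round(qos * 100, 4))  # 0.0037 (percent; the text quotes ~0.0036%)
```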
In addition, before the final broadcasting takes place, the distorted video frame
can be dropped or replaced with one of its neighbouring frames. This does not
impose noticeable changes on the playback of high frame rate videos and results in
a distortion-free video sequence. However, this process must take place after the
video frames are decoded and re-encoded; the introduced delay therefore has to be
accounted for to keep the real-time requirements satisfied. As shown in
Figure 4.15, one way to accelerate the broadcasting is to transmit the unpacked
video frames to the broadcaster before the video decoding and frame processing
take place. However, as long as the processing time for each frame (i.e. 25ms,
shown in Table 5.2) does not exceed the lifetime of a frame (i.e. 33ms for a 30fps
video), the playback will operate smoothly.
5.5 Summary
In this chapter we explained how the implemented system can be evaluated in an
isolated environment, and presented the proof-of-concept test results. The
benchmarking results were presented together with the acceleration gained by using
the GPU. Finally, quality measurement and improvement techniques were explained.
The next chapter summarizes the conclusions, findings and future work.
6 CONCLUSION AND FUTURE WORK
This chapter presents the conclusions based on the work performed in this thesis
in Section 6.1, followed by a presentation of future work possibilities to improve
and advance the implemented VLC system in Section 6.2. Further discussion is
presented at the end of this chapter in Section 6.3.
6.1 Conclusion
This thesis presents the implementation of a camera-based VLC system that can be
used for real-time applications in a cloud environment. The implemented VLC system
was used to provide inter-frame synchronization of multiple video streams. As
explained in Chapter 2, one of the many use cases of video synchronization is a
broadcasting director who wishes to view a number of video streams in a
synchronized manner.
The challenges and limitations in designing and implementing this system were also
presented. The results given in Chapter 5 revealed sufficient accuracy and agility
of the system for a real-time cloud-based application. At the time of writing this
thesis, the final version of the implemented VLC system is able to support up to
150 bits per second (i.e. 5 bits per frame on mobile devices).
However, the concept of VLC is still emerging, and more research on this topic is
needed. For example, this thesis found that communication through visible light
and digital cameras is not immune to the impact of video compression and motion
estimation techniques. This phenomenon needs to be investigated and studied
further. Moreover, the trade-off between flicker improvement techniques and system
accuracy needs more investigation for quality-of-experience purposes.
6.2 Future work
This section presents further research ideas that can help improve the existing
camera-based VLC system.
Increasing The Bitrate
In this work the lighting sources blink at a single frequency at a time. One way
to increase the bitrate of the communication is to transmit several frequencies at
a time. This can be achieved by individualizing the light sources and dedicating
each light source to a different frequency. However, this method requires the
emitted light of all light sources to be in the line of sight of the camera.
Another suggestion is to merge different frequencies into one blinking pattern and
make the light sources blink according to that pattern. Figure 6.1 shows how two
different sinusoid frequencies are added together. In vision science such spatial
frequencies (a.k.a. sinusoidal gratings) are used as visual stimuli [135] [136].
In Figure 6.1 each horizontal section represents a single frequency, while the
third frequency is the result of adding the first two. The right part of the
picture illustrates the signals as square pulses and the left part illustrates the
equivalent seen by the rolling shutter sensor. In this example, adding frequencies
means applying a logical AND to the states of the square pulses at each point in
time. In practice this operation could also be e.g. NAND. In addition, a different
modulation technique (such as FDM) could be more beneficial when applying this
method [137].
[Figure: two square-pulse signals F1 and F2 and their combination F1 + F2, shown both as pulsed signals and as the band patterns captured by the rolling shutter.]
Figure 6.1: Adding sinusoid frequencies
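The AND-based merging described above can be sketched on sampled square pulses (the function names and the 16-sample patterns are illustrative):

```python
def square_wave(freq, samples, duty=0.5):
    # Sample an on/off blinking pattern with 'freq' cycles over 'samples'
    # points and the given duty cycle.
    return [1 if (i * freq / samples) % 1.0 < duty else 0
            for i in range(samples)]

def merge_and(a, b):
    # Merge two blinking patterns with a logical AND, as in Figure 6.1.
    return [x & y for x, y in zip(a, b)]

f1 = square_wave(2, 16)  # 2 cycles over 16 samples
f2 = square_wave(4, 16)  # 4 cycles over 16 samples
print(merge_and(f1, f2))  # [1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0]
```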
Both of these methods require more research on how the blinking frequencies merge
in the physical layer and on how the camera reacts to and captures the resulting
dark/bright bands. Moreover, the detection mechanisms (e.g. the DFT) should also
be studied further.
Narrowing The Lifetime of Frequencies
The method of increasing the lifetime of frequencies suggested in Section 4.2.4
has the drawback of consuming spatial bandwidth. Researchers in [73] suggest a
method in which consecutive frames are stitched together into one long image and a
sliding window (the size of one frame) is then slid across the image, performing
an FFT conversion at each step.
Although this method could solve the problem of frame discontinuity, the
computational complexity and memory requirements might prevent the system from
meeting its real-time requirements. Depending on the movement of the sliding
window, the number of required FFT calls increases. Figure 6.2 illustrates a
situation where two frames are stitched together and a window slides by half a
frame in each step. Note that for every n, the nth frame has to be stitched to the
following (n+1)th frame.
[Figure: the nth and (n+1)th frames stitched together, each carrying a frequency (f1 and f2) with lifetime λ; a frame-sized window slides in half-frame steps, detecting nothing in slide 1, f1 and f2 in slide 2, and f2 in slide 3.]
Figure 6.2: Stitching images - Sliding window.
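The stitching-and-sliding procedure of [73] can be sketched as follows; the 1D row-average reduction inside the window is our own simplification, not part of the cited method:

```python
import numpy as np

def sliding_window_spectra(stitched, window, step):
    # Slide a frame-sized window down an image built by stitching
    # consecutive frames, returning the vertical magnitude spectrum at
    # each window position (one FFT call per step).
    spectra = []
    for top in range(0, stitched.shape[0] - window + 1, step):
        profile = stitched[top:top + window].mean(axis=1)
        spectra.append(np.abs(np.fft.rfft(profile - profile.mean())))
    return spectra
```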
The trade-off between bitrate and computational complexity in this method needs
more investigation. Moreover, new optimization techniques could be exploited to
overcome the high computational complexity. For example, a new technique could
combine our method with the stitching method presented in [73], where only a small
portion of the top/bottom borders of the frames is stitched together and checked
for frequency discontinuation.
CPU/GPU Cooperation
In Section 5.3 we explained how task migration to the GPU can accelerate the
performance of the computational tasks in the VLC receiver. In addition, CPU/GPU
cooperation provides pipelining possibilities that can speed up the heavy
computational tasks. With the DFT operations moved from the CPU to the GPU, the
CPU can perform other tasks in parallel, such as fetching and decoding the next
video frame from the buffer. The task scheduling schematic shown in Figure 6.3a
represents this scenario. Figure 6.3b illustrates the same tasks being executed on
the CPU only.
[Figure: (a) pipelined CPU-GPU cooperation, where the CPU runs the pre- and post-DFT stages α(n) and Ω(n) while the GPU runs the DFT stage λ(n) of another frame; (b) the same tasks executed sequentially on the CPU only.]
Figure 6.3: Processing times in pipelined and non-pipelined task scheduling
In Figure 6.3, α, λ and Ω represent the pre-DFT processes, the DFT process and the
post-DFT processes, respectively. The letter n indicates the nth video frame; for
example, λ(1) refers to the DFT process of the first video frame. As explained in
Section 4.3, the pre-DFT processes include threading operations, video decoding,
frame conversion, etc., and the post-DFT processes include message decoding and
the necessary computations on the DFT result. In Figure 6.3 the computation time
for the DFT operations on the CPU and the GPU is assumed to be the same, and the
context switching times are neglected.
6.3 Discussion
Visible Light Communication, like any other new technology, offers endless
possibilities. However, the ongoing research in this field should not be limited
to further development of the technology itself. Without sufficient consideration
and research, the advancement of a technology can end up endangering our health
and that of other species, and/or lead to technology obsolescence. It is therefore
necessary that our research cover all aspects of an evolving technology, including
its by-products.
One problem with the advancement of outdoor lighting is light pollution. Light
pollution is a form of environmental degradation that refers to unnecessary or
misdirected light [138]. The recent rise in popularity of LED lighting has (again)
raised awareness among researchers of the adverse effects of light pollution and
artificial lighting. The adverse consequences of light pollution include, but are
not limited to:
(1) Affecting the lifestyle of nocturnal and non-nocturnal species.
Inappropriate lighting conditions can affect the health and lifestyle of many
species, especially mammals and nocturnal animals [139] [140]. Although it is
believed that the development of LED lighting can help reduce the side effects of
light pollution on animals, this remains controversial. For example, the research
in [141] shows that although LED light sources affect bat species less than older
technologies do, these so-called green lighting technologies still impose
ecological impacts that need to be considered in further research.
(2) Affecting the human psyche.
In Chapter 3 of this thesis some health-related benefits of communication over
light were presented. The fact that visible light does not have the drawbacks of
other radio waves makes it interesting for many applications. However, recent
research has drawn attention to the side effects of artificial light on the human
psyche [142]. These side effects include, for example, changes in melatonin levels
[143] that can affect sleeping routines or even lead to cancer [144].
(3) Skyglow.
Skyglow is the illumination of the night sky when artificial light is scattered by
atmospheric molecules or aerosols and returned to Earth [145]. The negative
effects of skyglow include, for example, reduced stellar visibility (to the extent
that more than half of the EU population has already lost naked-eye visibility of
the Milky Way) [146], disturbed bird orientation during migration [147] and many
other impacts on biological and ecological systems [148].
The transition from traditional lighting to LED-based lighting is not believed to
increase or decrease the skyglow effect per se, but the computerized
controllability of LED-based systems in terms of direction of emission, brightness
and color can help reduce the skyglow effect [149].
Nonetheless, many researchers have focused on the reduction and prevention of these
adverse effects by suggesting the following methods [150] [151]:
(1) Shielding and redesigning
(2) Limiting coverage area
(3) Dimming, shortening (time limitation) and even shutting down
(4) Growth limitation
(5) Spectrum shifting
It is inevitable that drastic changes in the design and rethinking of lighting
systems will have an impact on the development, advancement and applications of
indoor/outdoor VLC systems. Therefore, more research on related topics is
needed [152].
BIBLIOGRAPHY
[1] Longfei Wu, Xiaojiang Du, and Xinwen Fu. Security threats to mobile multimedia applications: Camera-based attacks on mobile phones. Communications Magazine, IEEE, 52(3):80–87, 2014.
[2] K. Breitman, M. Endler, R. Pereira, and M. Azambuja. When tv dies, will it go to the cloud? Computer, 43(4):81–83, April 2010. doi:10.1109/MC.2010.118.
[3] Sylwia Kechiche. Cellular m2m forecasts: unlocking growth. Technical report, GSMA Intelligence, February 2015.
[4] Cisco. Cisco visual networking index: Global mobile data traffic forecast update, 2014–2019. Technical report, Cisco, February 2015.
[5] Ming-Jiang Yang, Jo Yew Tham, Dajun Wu, and Kwong Huang Goh. Cost effective ip camera for video surveillance. In Industrial Electronics and Applications, 2009. ICIEA 2009. 4th IEEE Conference on, pages 2432–2435. IEEE, 2009.
[6] K. Jeffay, D.L. Stone, T. Talley, and F.D. Smith. Adaptive, best-effort delivery of digital audio and video across packet-switched networks. In P. Venkat Rangan, editor, Network and Operating System Support for Digital Audio and Video, volume 712 of Lecture Notes in Computer Science, pages 1–14. Springer Berlin Heidelberg, 1993. URL: http://dx.doi.org/10.1007/3-540-57183-3_1, doi:10.1007/3-540-57183-3_1.
[7] Yung-Chih Chen and Don Towsley. On bufferbloat and delay analysis of multipath tcp in wireless networks.
[8] Yung-Chih Chen, Don Towsley, Erich M Nahum, Richard J Gibbens, and Yeon-sup Lim. Characterizing 4g and 3g networks: Supporting mobility with multipath tcp. School of Computer Science, University of Massachusetts Amherst, Tech. Rep, 22, 2012.
[9] Eralda Caushaj, Ivan Ivanov, Huirong Fu, Ishwar Sethi, and Ye Zhu. Evaluating throughput and delay in 3g and 4g mobile architectures. Journal of Computer and Communications, 2014, 2014.
[10] Dmitry Pundik and Yael Moses. Video synchronization using temporal signals from epipolar lines. In Kostas Daniilidis, Petros Maragos, and Nikos Paragios, editors, Computer Vision – ECCV 2010, volume 6313 of Lecture Notes in Computer Science, pages 15–28. Springer Berlin Heidelberg, 2010. URL: http://dx.doi.org/10.1007/978-3-642-15558-1_2, doi:10.1007/978-3-642-15558-1_2.
[11] Sridhar Rajagopal, Richard D Roberts, and Sang-Kyu Lim. Ieee 802.15.7 visible light communication: modulation schemes and dimming support. Communications Magazine, IEEE, 50(3):72–82, 2012.
[12] Hyun-Seung Kim, Deok-Rae Kim, Se-Hoon Yang, Yong-Hwan Son, and Sang-Kook Han. An indoor visible light communication positioning system using an rf carrier allocation technique. Lightwave Technology, Journal of, 31(1):134–144, 2013.
[13] Felix Schill, Uwe R Zimmer, and Jochen Trumpf. Visible spectrum optical communication and distance sensing for underwater applications. In Proceedings of ACRA, volume 2004, pages 1–8, 2004.
[14] Soo-Yong Jung, Swook Hann, and Chang-Soo Park. Tdoa-based optical wireless indoor localization using led ceiling lamps. Consumer Electronics, IEEE Transactions on, 57(4):1592–1597, 2011.
[15] N Khan and N Abas. Comparative study of energy saving light sources. Renewable and Sustainable Energy Reviews, 15(1):296–309, 2011.
[16] Siddha Pimputkar, James S Speck, Steven P DenBaars, and Shuji Nakamura. Prospects for led lighting. Nature Photonics, 3(4):180–182, 2009.
[17] Maury Wright. Us government accelerates led street light push in doe program. LEDs Magazine, Online Articles, 2015. URL: http://www.ledsmagazine.com/articles/2015/01/us-government-accelerates-led-street-light-push-in-doe-program.html.
[18] S. Haruyama. Progress of visible light communication. In Optical Fiber Communication (OFC), collocated National Fiber Optic Engineers Conference, 2010 Conference on (OFC/NFOEC), pages 1–3, March 2010.
[19] Aníbal De Almeida, Bruno Santos, Bertoldi Paolo, and Michel Quicheron. Solid state lighting review – potential and challenges in europe. Renewable and Sustainable Energy Reviews, 34(0):30–48, 2014.
[20] Roland Haitz and Jeffrey Y Tsao. Solid-state lighting: ‘the case’ 10 years after and future prospects. physica status solidi (a), 208(1):17–29, 2011.
[21] E Fred Schubert and Jong Kyu Kim. Solid-state light sources getting smart. Science, 308(5726):1274–1278, 2005.
[22] Jurgen Hase. Intelligent lighting paves the way for the smart city. LEDs Magazine, 73, 2014.
[23] M. Castro, A.J. Jara, and A.F.G. Skarmeta. Smart lighting solutions for smart cities. In Advanced Information Networking and Applications Workshops (WAINA), 2013 27th International Conference on, pages 1374–1379, March 2013. doi:10.1109/WAINA.2013.254.
[24] Walter Karlen, Joanne Lim, J Mark Ansermino, Guy Dumont, and Cornie Scheffer. Design challenges for camera oximetry on a mobile phone. In Engineering in Medicine and Biology Society (EMBC), 2012 Annual International Conference of the IEEE, pages 2448–2451. IEEE, 2012.
[25] Tim Hayes. Next-generation cell phone cameras. Optics and Photonics News, 23(2):16–21, 2012.
[26] Damien Igoe, Alfio Parisi, and Brad Carter. Characterization of a smartphone camera’s response to ultraviolet a radiation. Photochemistry and Photobiology, 89(1):215–218, 2013. doi:10.1111/j.1751-1097.2012.01216.x.
[27] Zabih Ghassemlooy, Wasiu Popoola, and Sujan Rajbhandari. Optical wireless communications: system and channel modelling with Matlab®. CRC Press, 2012.
[28] Jingfeng Zhang, Ying Li, and Yanna Wei. Using timestamp to realize audio-video synchronization in real-time streaming media transmission. In Audio, Language and Image Processing, 2008. ICALIP 2008. International Conference on, pages 1073–1076. IEEE, 2008.
[29] Robert T Collins, Omead Amidi, and Takeo Kanade. An active camera system for acquiring multi-view video. In ICIP (1), pages 527–520, 2002.
[30] Marc Pollefeys, Sudipta N. Sinha, Li Guan, and Jean-Sébastien Franco. Chapter 2 - multi-view calibration, synchronization, and dynamic scene reconstruction. In Hamid Aghajan and Andrea Cavallaro, editors, Multi-Camera Networks, pages 29–75. Academic Press, Oxford, 2009. URL: http://www.sciencedirect.com/science/article/pii/B9780123746337000045, doi:http://dx.doi.org/10.1016/B978-0-12-374633-7.00004-5.
[31] A. Whitehead, R. Laganiere, and P. Bose. Temporal synchronization of video sequences in theory and in practice. In Application of Computer Vision, 2005. WACV/MOTIONS ’05 Volume 1. Seventh IEEE Workshops on, volume 2, pages 132–137, Jan 2005. doi:10.1109/ACVMOT.2005.114.
[32] Colin Perkins. Rtp: Audio and Video for the Internet. Addison-Wesley Professional, first edition, 2003.
[33] D. Mills. Network time protocol (version 3) specification, implementation. Technical report, United States, 1992.
[34] T. Tuytelaars and L. Van Gool. Synchronizing video sequences. In Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on, volume 1, pages I–762–I–768 Vol.1, June 2004. doi:10.1109/CVPR.2004.1315108.
[35] Yaron Caspi and Michal Irani. A step towards sequence-to-sequence alignment. In Computer Vision and Pattern Recognition, 2000. Proceedings. IEEE Conference on, volume 2, pages 682–689. IEEE, 2000.
[36] Philip A Tresadern and Ian Reid. Synchronizing image sequences of non-rigid objects. In BMVC, pages 1–10, 2003.
[37] Lisa Spencer and Mubarak Shah. Temporal synchronization from camera motion. In Proceedings of Asian Conference on Computer Vision, pages 515–520, 2004.
[38] Cheng Lei and Yee-Hong Yang. Tri-focal tensor-based multiple video synchronization with subframe optimization. Image Processing, IEEE Transactions on, 15(9):2473–2480, Sept 2006. doi:10.1109/TIP.2006.877438.
[39] L. Lee, R. Romano, and G. Stein. Monitoring activities from multiple video streams: establishing a common coordinate frame. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 22(8):758–767, Aug 2000. doi:10.1109/34.868678.
[40] Jingyu Yan and Marc Pollefeys. Video synchronization via space-time interest point distribution. In Advanced Concepts for Intelligent Vision Systems, pages 501–504, 2004.
[41] Lior Wolf and Assaf Zomet. Correspondence-free synchronization and reconstruction in a non-rigid scene. In Proc. Workshop on Vision and Modelling of Dynamic Scenes, Copenhagen, 2002.
[42] Prarthana Shrstha, Mauro Barbieri, and Hans Weda. Synchronization of multi-camera video recordings based on audio. In Proceedings of the 15th international conference on Multimedia, pages 545–548. ACM, 2007.
[43] Ken Goldberg, Camille Crittenden, Abram Stern, and John Scott. The rashomon project, Jan 2015. URL: http://rieff.ieor.berkeley.edu/rashomon/.
[44] Shlomi Arnon. Visible light communication. Cambridge University Press, 2015.
[45] Dobroslav Tsonev, Hyunchae Chun, Sujan Rajbhandari, Jonathan JD McKendry, Stefan Videv, Erdan Gu, Mohsin Haji, Scott Watson, Anthony E Kelly, Grahame Faulkner, et al. A 3-gb/s single-led ofdm-based wireless vlc link using a gallium nitride. Photonics Technology Letters, IEEE, 26(7):637–640, 2014.
[46] S. Okada, T. Yendo, T. Yamazato, T. Fujii, M. Tanimoto, and Y. Kimura. On-vehicle receiver for distant visible light road-to-vehicle communication. In Intelligent Vehicles Symposium, 2009 IEEE, pages 1033–1038, June 2009. doi:10.1109/IVS.2009.5164423.
[47] Navina Kumar, Nuno Lourenco, Michal Spiez, and Rui L Aguiar. Visible light communication systems conception and vidas. IETE Technical Review, 25(6):359–367, 2008.
[48] Woo-Chan Kim, Chi-Sung Bae, Soo-Yong Jeon, Sung-Yeop Pyun, and Dong-Ho Cho. Efficient resource allocation for rapid link recovery and visibility in visible-light local area networks. Consumer Electronics, IEEE Transactions on, 56(2):524–531, 2010.
[49] T. Komine and M. Nakagawa. Fundamental analysis for visible-light communication system using led lights. Consumer Electronics, IEEE Transactions on, 50(1):100–107, Feb 2004. doi:10.1109/TCE.2004.1277847.
[50] Jong Kyu Kim and E Fred Schubert. Transcending the replacement paradigm of solid-state lighting. Optics Express, 16(26):21835–21842, 2008.
[51] Ashwin Ashok, Marco Gruteser, Narayan Mandayam, Jayant Silva, Michael Varga, and Kristin Dana. Challenge: Mobile optical networks through visual mimo. In Proceedings of the sixteenth annual international conference on Mobile computing and networking, pages 105–112. ACM, 2010.
[52] Silvano Donati. Photodetectors. Prentice Hall PTR, 1999.
[53] Sanka Gateva. Photodetectors. InTech, 2012.
[54] Masao Nakagawa and Shinichiro Haruyama. Camera-equipped cellular terminal for visible light communication, February 1 2005. US Patent App. 10/588,009.
[55] Paul Dietz, William Yerazunis, and Darren Leigh. Very low-cost sensing and communication using bidirectional leds. In UbiComp 2003: Ubiquitous Computing, pages 175–191. Springer, 2003.
[56] Stefan Schmid, Giorgio Corbellini, Stefan Mangold, and Thomas R Gross. Led-to-led visible light communication networks. In Proceedings of the fourteenth ACM international symposium on Mobile ad hoc networking and computing, pages 1–10. ACM, 2013.
[57] Aleksandar Jovicic, Junyi Li, and Tom Richardson. Visible light communication: Opportunities, challenges and the path to market. Communications Magazine, IEEE, 51(12):26–32, 2013.
[58] Michael B Rahaim, Anna Maria Vegni, and Thomas DC Little. A hybrid radio frequency and broadcast visible light communication system. In GLOBECOM Workshops (GC Wkshps), 2011 IEEE, pages 792–796. IEEE, 2011.
[59] Hany Elgala, Raed Mesleh, and Harald Haas. Indoor optical wireless communication: potential and state-of-the-art. Communications Magazine, IEEE, 49(9):56–62, 2011.
[60] Ieee standard for local and metropolitan area networks–part 15.7: Short-range wireless optical communication using visible light. IEEE Std 802.15.7-2011, pages 1–309, Sept 2011. doi:10.1109/IEEESTD.2011.6016195.
[61] Yuanquan Wang, Yiguang Wang, Nan Chi, Jianjun Yu, and Huiliang Shang. Demonstration of 575-mb/s downlink and 225-mb/s uplink bi-directional scm-wdm visible light communication using rgb led and phosphor-based led. Optics Express, 21(1):1203–1208, 2013.
[62] Yuanquan Wang, Yufeng Shao, Huiliang Shang, Xiaoyuan Lu, Yiguang Wang, Jianjun Yu, and Nan Chi. 875-mb/s asynchronous bi-directional 64qam-ofdm scm-wdm transmission over rgb-led-based visible light communication system. In Optical Fiber Communication Conference, pages OTh1G–3. Optical Society of America, 2013.
[63] Shinya Iwasaki, Chinthaka Premachandra, Tomohiro Endo, Toshiaki Fujii, Masayuki Tanimoto, and Yoshikatsu Kimura. Visible light road-to-vehicle communication using high-speed camera. In Intelligent Vehicles Symposium, 2008 IEEE, pages 13–18. IEEE, 2008.
[64] Halpage Chinthaka Nuwandika Premachandra, Tomohiro Yendo, Mehrdad Panahpour Tehrani, Takaya Yamazato, Hiraku Okada, Toshiaki Fujii, and Masayuki Tanimoto. High-speed-camera image processing based led traffic light detection for road-to-vehicle visible light communication. In Intelligent Vehicles Symposium (IV), 2010 IEEE, pages 793–798. IEEE, 2010.
[65] Toru Nagura, Takaya Yamazato, Masaaki Katayama, Tomohiro Yendo, Toshiaki Fujii, and Hiraku Okada. Improved decoding methods of visible light communication system for its using led array and high-speed camera. In Vehicular Technology Conference (VTC 2010-Spring), 2010 IEEE 71st, pages 1–5. IEEE, 2010.
[66] Albert J.P. Theuwissen. CMOS image sensors: State-of-the-art. Solid-State Electronics, 52(9):1401–1406, 2008. Papers Selected from the 37th European Solid-State Device Research Conference - ESSDERC’07. URL: http://www.sciencedirect.com/science/article/pii/S0038110108001317, doi:http://dx.doi.org/10.1016/j.sse.2008.04.012.
[67] Heinz Helmers and Markus Schellenberg. Cmos vs. ccd sensors in speckle interferometry. Optics & Laser Technology, 35(8):587–595, 2003.
[68] Marci Meingast, Christopher Geyer, and Shankar Sastry. Geometric models of rolling-shutter cameras. arXiv preprint cs/0503076, 2005.
[69] Chia-Kai Liang, Li-Wen Chang, and H.H. Chen. Analysis and compensation of rolling shutter effect. Image Processing, IEEE Transactions on, 17(8):1323–1330, Aug 2008. doi:10.1109/TIP.2008.925384.
[70] Dave Litwiller. Ccd vs. cmos. Photonics Spectra, 35(1):154–158, 2001.
[71] Omar Ait-Aider, Nicolas Andreff, Jean Marc Lavest, and Philippe Martinet. Simultaneous object pose and velocity computation using a single view from a rolling shutter camera. In Computer Vision–ECCV 2006, pages 56–68. Springer, 2006.
[72] P-E Forssén and Erik Ringaby. Rectifying rolling shutter video from hand-held devices. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pages 507–514. IEEE, 2010.
[73] Niranjini Rajagopal, Patrick Lazik, and Anthony Rowe. Visual light landmarks for mobile devices. In Proceedings of the 13th international symposium on Information processing in sensor networks, pages 249–260. IEEE Press, 2014.
[74] Niranjini Rajagopal, Patrick Lazik, and Anthony Rowe. Demonstration abstract: How many lights do you see? In Information Processing in Sensor Networks, IPSN-14 Proceedings of the 13th International Symposium on, pages 347–348. IEEE, 2014.
[75] Ye-Sheng Kuo, Pat Pannuto, Ko-Jen Hsiao, and Prabal Dutta. Luxapose: Indoor positioning with mobile phones and visible light. In Proceedings of the 20th annual international conference on Mobile computing and networking, pages 447–458. ACM, 2014.
[76] Yoshinori Matsumoto, Takaharu Hara, and Yohsuke Kimura. Cmos phototransistor array detection system for visual light identification (id). In Networked Sensing Systems, 2008. INSS 2008. 5th International Conference on, pages 99–102. IEEE, 2008.
[77] Robert LiKamWa, David Ramirez, and Jason Holloway. Styrofoam: a tightly packed coding scheme for camera-based visible light communication. In Proceedings of the 1st ACM MobiCom workshop on Visible light communication systems, pages 27–32. ACM, 2014.
[78] Samuel David Perli, Nabeel Ahmed, and Dina Katabi. Pixnet: interference-free wireless links using lcd-camera pairs. In Proceedings of the sixteenth annual international conference on Mobile computing and networking, pages 137–148. ACM, 2010.
[79] A. Bingham and D. Spradlin. The Open Innovation Marketplace: Creating Value in the Challenge Driven Enterprise. Pearson Education, 2011.
[80] Joel West and Scott Gallagher. Challenges of open innovation: the paradox of firm investment in open-source software. R&D Management, 36(3):319–331, 2006. URL: http://dx.doi.org/10.1111/j.1467-9310.2006.00436.x, doi:10.1111/j.1467-9310.2006.00436.x.
[81] Ffmpeg project. Ffmpeg, Jan 2015. URL: http://www.ffmpegs.org/.
[82] G. Bradski. The opencv library. Dr. Dobb’s Journal of Software Tools, 2000.
[83] Massimo Banzi. Getting Started with Arduino. Make Books - Imprint of: O’Reilly Media, Sebastopol, CA, ill edition, 2008.
[84] Kaiyun Cui, Gang Chen, Zhengyuan Xu, and Richard D Roberts. Line-of-sight visible light communication system design and demonstration. In Communication Systems Networks and Digital Signal Processing (CSNDSP), 2010 7th International Symposium on, pages 621–625. IEEE, 2010.
[85] F.F. Mazda. Electronics Engineer’s Reference Book. Elsevier Science, 2013.
[86] R.F. Pierret. Semiconductor Device Fundamentals. Addison-Wesley, 1996.
[87] G.R. Jones. Electrical Engineer’s Reference Book. Elsevier Science, 2013.
[88] STMicroelectronics. Complementary power Darlington transistors, October 2008. Rev. 4.
[89] Rudolf F Graf and William Sheets. Encyclopedia of Electronic Circuits, Vol. 4, volume 4. Granite Hill Publishers, 1992.
[90] Kwok K. Ng. Phototransistor, pages 462–469. John Wiley & Sons, Inc., 2009. URL: http://dx.doi.org/10.1002/9781118014769.ch59, doi:10.1002/9781118014769.ch59.
[91] CNY17 Series. Optocoupler with phototransistor output. Vishay Telefunken, 1999.
[92] Atmel, http://www.atmel.com/Images/doc2503.pdf. Atmel ATMega32 microcontroller datasheet, February 2011.
[93] Alessandro D’Ausilio. Arduino: A low-cost multipurpose lab equipment. Behavior research methods, 44(2):305–313, 2012.
[94] Atmel, http://www.atmel.com/Images/doc7799.pdf. Atmel ATMega16U2 microcontroller datasheet, September 2012.
[95] Steve Winder. Power supplies for LED driving. Newnes, 2011.
[96] M Saadi, L Wattisuttikulkij, Y Zhao, and P Sangwongngam. Visible light communication: opportunities, challenges and channel models. International Journal of Electronics & Informatics, 2(1):1–11, 2013.
[97] Bo Bai, Zhengyuan Xu, and Yangyu Fan. Joint led dimming and high capacity visible light communication by overlapping ppm. In Wireless and Optical Communications Conference (WOCC), 2010 19th Annual, pages 1–5. IEEE, 2010.
[98] O. Bouchet. Wireless Optical Telecommunications. ISTE. Wiley, 2013. URL: https://books.google.fi/books?id=HBFSt4O64VgC.
[99] Shin-Yi Chang, Jo-Ping Li, Hua-Min Tseng, and Pai H Chou. Greendicator: Enabling optical pulse-encoded data output from wsn for display on smartphones.
[100] Ubolthip Sethakaset and T. Aaron Gulliver. Differential amplitude pulse-position modulation for indoor wireless optical communications. EURASIP J. Wirel. Commun. Netw., 2005(1):3–11, March 2005. URL: http://dx.doi.org/10.1155/WCN.2005.3, doi:10.1155/WCN.2005.3.
[101] Xiao Zhang, Svilen Dimitrov, Sinan Sinanovic, and Harald Haas. Optimal power allocation in spatial modulation ofdm for visible light communications. In Vehicular Technology Conference (VTC Spring), 2012 IEEE 75th, pages 1–5. IEEE, 2012.
[102] R.D. Roberts. Undersampled frequency shift on-off keying (ufsook) for camera communications (camcom). In Wireless and Optical Communication Conference (WOCC), 2013 22nd, pages 645–648, May 2013. doi:10.1109/WOCC.2013.6676454.
[103] JE Farrell, Brian L Benson, and Carl R Haynie. Predicting flicker thresholds for video display terminals. In Proc SID, volume 28, pages 449–453, 1987.
[104] DH Kelly. Diffusion model of linear flicker responses. JOSA, 59(12):1665–1670, 1969.
[105] D.H. Kelly. Sine waves and flicker fusion. Documenta Ophthalmologica, 18(1):16–35, 1964. URL: http://dx.doi.org/10.1007/BF00160561, doi:10.1007/BF00160561.
[106] Barry B. Lee, Joel Pokorny, Paul R. Martin, Arne Valbergt, and Vivianne C. Smith. Luminance and chromatic modulation sensitivity of macaque ganglion cells and human observers. J. Opt. Soc. Am. A, 7(12):2223–2236, Dec 1990. URL: http://josaa.osa.org/abstract.cfm?URI=josaa-7-12-2223, doi:10.1364/JOSAA.7.002223.
[107] Samuel Sokol and Lorrin A Riggs. Electrical and psychophysical responses of the human visual system to periodic variation of luminance. Investigative Ophthalmology & Visual Science, 10(3):171–180, 1971.
[108] T. Keppler, N. Watson, and J. Arrillaga. Computation of the short-term flicker severity index. Power Delivery, IEEE Transactions on, 15(4):1110–1115, Oct 2000. doi:10.1109/61.891490.
[109] Samuel M Berman, Daniel S Greenhouse, Ian L Bailey, Robert D Clear, and Thomas W Raasch. Human electroretinogram responses to video displays, fluorescent lighting, and other high frequency sources. Optometry & Vision Science, 68(8):645–662, 1991.
[110] I.E. Richardson. The H.264 Advanced Video Compression Standard. Wiley, 2011. URL: http://www.google.fi/books?id=k7nOAiIUo9IC.
[111] Tzi-Dar Chiueh, Pei-Yun Tsai, and I-Wei Lai. Baseband Receiver Design for Wireless MIMO-OFDM Communications. John Wiley & Sons, 2012.
[112] Digital Signal Processing. Laxmi Publications Pvt Ltd, 2007.
[113] Jack D Gaskill. Linear systems, fourier transforms, and optics. 1978.
[114] SJ Ranade and W Xu. An overview of harmonics modeling and simulation. IEEE Task Force on Harmonics Modeling and Simulation, page 1, 2007.
[115] Iaroslav V Blagouchine and Eric Moreau. Analytic method for the computation of the total harmonic distortion by the cauchy method of residues. Communications, IEEE Transactions on, 59(9):2478–2491, 2011.
[116] L Svilainis. Led pwm dimming linearity investigation. Displays, 29(3):243–249, 2008.
[117] Prathyusha Narra and Donald S Zinger. An effective led dimming approach. In Industry Applications Conference, 2004. 39th IAS Annual Meeting. Conference Record of the 2004 IEEE, volume 3, pages 1671–1676. IEEE, 2004.
[118] M Anand and Prasoon Mishra. A novel modulation scheme for visible light communication. In India Conference (INDICON), 2010 Annual IEEE, pages 1–3. IEEE, 2010.
[119] Richard D Roberts. Space-time forward error correction for dimmable undersampled frequency shift on-off keying camera communications (camcom). In Ubiquitous and Future Networks (ICUFN), 2013 Fifth International Conference on, pages 459–464. IEEE, 2013.
[120] Christos Danakis, Mostafa Afgani, Gordon Povey, Ian Underwood, and Harald Haas. Using a cmos camera sensor for visible light communication. In Globecom Workshops (GC Wkshps), 2012 IEEE, pages 1244–1248. IEEE, 2012.
[121] The discrete fourier transform in 2d. In Digital Image Processing, Texts in Computer Science, pages 343–366. Springer London, 2008. URL: http://dx.doi.org/10.1007/978-1-84628-968-2_14, doi:10.1007/978-1-84628-968-2_14.
[122] C. Solomon and T. Breckon. Fundamentals of Digital Image Processing: A Practical Approach with Examples in Matlab. Wiley, 2011. URL: https://books.google.fi/books?id=NoJ15jLdy7YC.
[123] Amir Averbuch, Ronald R Coifman, David L Donoho, Michael Elad, and Moshe Israeli. Fast and accurate polar fourier transform. Applied and computational harmonic analysis, 21(2):145–167, 2006.
[124] Pierre Duhamel and Martin Vetterli. Fast fourier transforms: a tutorial review and a state of the art. Signal processing, 19(4):259–299, 1990.
[125] G. Bradski and A. Kaehler. Learning OpenCV: Computer Vision with the OpenCV Library. O’Reilly Media, 2008. URL: https://books.google.fi/books?id=seAgiOfu2EIC.
[126] Paolo Prandoni and Martin Vetterli. Signal processing for communications. CRC Press, 2008.
[127] S. Qureshi. Embedded Image Processing on the TMS320C6000TM DSP: Examples in Code Composer StudioTM and MATLAB. Springer, 2005. URL: https://books.google.fi/books?id=w3BZ0PrmqtkC.
[128] YoungChoon Lee and Albert Y. Zomaya. Energy efficient utilization of resources in cloud computing systems. The Journal of Supercomputing, 60(2):268–280, 2012. URL: http://dx.doi.org/10.1007/s11227-010-0421-3, doi:10.1007/s11227-010-0421-3.
[129] Shane Ryoo, Christopher I Rodrigues, Sara S Baghsorkhi, Sam S Stone, David B Kirk, and Wen-mei W Hwu. Optimization principles and application performance evaluation of a multithreaded gpu using cuda. In Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming, pages 73–82. ACM, 2008.
[130] Thilaka Sumanaweera and Donald Liu. Medical image reconstruction with the fft. GPU gems, 2:765–784, 2005.
[131] Yasuhiko Ogata, Toshio Endo, Naoya Maruyama, and Satoshi Matsuoka. An efficient, model-based cpu-gpu heterogeneous fft library. In Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on, pages 1–10. IEEE, 2008.
[132] Hany Elgala, Raed Mesleh, Harald Haas, and Bogdan Pricope. Ofdm visible light wireless communication based on white leds. In Vehicular Technology Conference, 2007. VTC2007-Spring. IEEE 65th, pages 2185–2189. IEEE, 2007.
[133] Sunghwan Kim and Sung-Yoon Jung. Novel fec coding scheme for dimmable visible light communication based on the modified reed-muller codes. Photonics Technology Letters, IEEE, 23(20):1514–1516, Oct 2011.
[134] D.R. Smith. Digital Transmission Systems. Springer, 1993.
[135] Sang-Hun Lee and Randolph Blake. Detection of temporal structure depends on spatial structure. Vision research, 39(18):3033–3048, 1999.
[136] Davida Y. Teller. Vision and the Visual System. University of Washington, 2014.
[137] Yuichi Tanaka, Toshihiko Komine, Shinichiro Haruyama, and Masao Nakagawa. Indoor visible light data transmission system utilizing white led lights. IEICE transactions on communications, 86(8):2440–2454, 2003.
[138] K. Narisada and D. Schreuder. Light Pollution Handbook. Number v. 322 in Astrophysics and Space Science Library. Springer, 2004. URL: http://www.google.fi/books?id=61B_RV3EdIcC.
[139] David E Blask, George C Brainard, Robert T Dauchy, John P Hanifin, Leslie K Davidson, Jean A Krause, Leonard A Sauer, Moises A Rivera-Bermudez, Margarita L Dubocovich, Samar A Jasser, et al. Melatonin-depleted blood from premenopausal women exposed to light at night stimulates growth of human breast cancer xenografts in nude rats. Cancer research, 65(23):11174–11184, 2005.
[140] Eric Warrant and Marie Dacke. Visual orientation and navigation in nocturnal arthropods. Brain Behav Evol, 75:156–173, 2010.
[141] Emma L Stone, Gareth Jones, and Stephen Harris. Conserving energy at a cost to biodiversity? impacts of led lighting on bats. Global Change Biology, 18(8):2458–2465, 2012.
[142] Stephen M Pauley. Lighting for the human circadian clock: recent research indicates that lighting has become a public health issue. Medical hypotheses, 63(4):588–596, 2004.
[143] Helen R Wright and Leon C Lack. Effect of light wavelength on suppression and phase delay of the melatonin rhythm. Chronobiology international, 18(5):801–808, 2001.
[144] Richard G Stevens and Mark S Rea. Light in the built environment: potential role of circadian disruption in endocrine disruption and breast cancer. Cancer Causes & Control, 12(3):279–287, 2001.
[145] Christopher CM Kyba and Franz Hölker. Do artificially illuminated skies affect biodiversity in nocturnal landscapes? Landscape Ecology, 28(9):1637–1640, 2013.
[146] Pierantonio Cinzano, Fabio Falchi, and Christopher D Elvidge. The first world atlas of the artificial night sky brightness. Monthly Notices of the Royal Astronomical Society, 328(3):689–707, 2001.
[147] Ron Chepesiuk. Missing the dark: health effects of light pollution. Environmental Health Perspectives, 117(1):A20–A27, 2009.
[148] Catherine Rich and Travis Longcore. Ecological consequences of artificial night lighting. Island Press, 2013.
[149] A Bierman. Will switching to led outdoor lighting increase sky glow? Lighting Research and Technology, 44(4):449–458, 2012.
[150] Fabio Falchi, Pierantonio Cinzano, Christopher D Elvidge, David M Keith, and Abraham Haim. Limiting the impact of light pollution on human health, environment and stellar visibility. Journal of environmental management, 92(10):2714–2722, 2011.
[151] Kevin J Gaston, Thomas W Davies, Jonathan Bennie, and John Hopkins. Review: Reducing the ecological consequences of night-time light pollution: options and developments. Journal of Applied Ecology, 49(6):1256–1266, 2012.
[152] Franz Hölker, Timothy Moss, Barbara Griefahn, Werner Kloas, Christian C Voigt, Dietrich Henckel, Andreas Hänel, Peter M Kappeler, Stephan Völker, Axel Schwope, et al. The dark side of light: a transdisciplinary research agenda for light pollution policy. 2010.
A APPENDIX
A.1 Direct Communication With The Arduino Board
The first layer of the API can be used to program the Arduino board to send inform-
ation by directly communicating with the Arduino. A program built with this API can
take care of the characteristics of the frequencies, their lifetime, the output pins, and so on.
In this section we demonstrate the use of the API for this purpose through ex-
ample applications. The second layer of the API can be used in later stages
for a higher level of communication. The first step is to include the header file and
instantiate the necessary objects.
#include <VLCTX.h>
FREQS freqs;
VLCTX vlctx;
There are mainly two classes in this API, and instances of both are necessary
for a program to function properly. The class FREQS simply defines the frequencies that
are going to be used for the communication, while the class VLCTX defines the lifetime of
the frequencies, the output pins and the methods for generating these frequencies. The latter
class also has some built-in methods for developing test units, which can automatically
generate bit patterns for test purposes.
The second step is to construct the objects with the desired characteristics. This is
done in the void setup() function along with other initialization tasks. There are two
overloaded constructors for this purpose, both called FREQS_init.
If no arguments are passed to the constructor method, all variables will be set to
their default values. This is helpful for fast prototyping, where one does not want to get
involved with too many details. The second constructor takes the number of different
frequencies as its first argument, followed by the half-cycle period of each frequency
in microseconds. In the example below we will have three frequencies, namely 3 kHz,
2.4 kHz and 1.992 kHz.
freqs.FREQS_init(3, 165, 208, 251);
This construction has to be done before constructing the object of the VLCTX class,
as the constructor of the VLCTX class requires a reference to an object of the FREQS
class as its first argument. The second argument is the lifetime of each frequency in
microseconds and the third argument is the number of output pins on the Arduino
board. The remaining arguments are the actual pin numbers of the outputs. In this
example the lifetime of a frequency is set to 6600 microseconds, which is one fifth of
the lifetime of an entire frame of a 30 fps video, and there are two output pins, numbers
8 and 12.
vlctx.VLCTX_init(freqs, 6600, 2, 8, 12);
The communication with the Arduino board is made through the serial port. There-
fore the initialization of the serial port is also done in the void setup() function. In
case one desires to use any other means of communication (e.g. WiFi, Ethernet), they
should provide the necessary interfaces in their Arduino sketch. The final setup() func-
tion would be similar to the one provided below.
void setup() {
  freqs.FREQS_init(3, 165, 208, 251);
  vlctx.VLCTX_init(freqs, 6600, 2, 8, 12);
  Serial.begin(9600);
  Serial.flush();
}
The next step is to generate all the necessary bit patterns in the loop() function. The
simplest way of doing so is to read characters one by one from the serial port. For this
purpose one can use the Start_Logic(int) function of the API in the following form.
void loop() {
  if (Serial.available() > 0)
    vlctx.Start_Logic(Serial.read());
}
Note that the serial communication, timing and delays should be taken care of by the programmer. By default, during idle mode, the program keeps all the output pins HIGH. Note also that this API does not have any prevention mechanism to warn the user about sending "wrong" characters through the serial port; any character that is not found in the defined scope will eventually translate to some random frequency.
In case one does not want to send any information over a communication channel, but rather to provide the information in a built-in manner at compile time, the Start_Str(String) method can be used, which takes the bit pattern as its string argument. In the example below F0, F2 and F1 will be generated right after each other.
vlctx.Start_Str("021");
In order to generate automatic bit patterns, one of the Auto_Start methods can be used. These methods take advantage of different permutations of a bit pattern and are useful for testing the robustness of the system. For example, the function call below generates the 5th permutation of the bit pattern "012", i.e. "201", repeats this permuted bit pattern 6 times and places a 200 millisecond delay between each repetition.
vlctx.Auto_Start(5, "012", 200, 6);
Note that if the user sets the delay to 0, the program will automatically place a 493 millisecond delay (i.e. 15 frames, including the delay introduced by the microcontroller) between each repetition. The reason for this is to make the testing environment more isolated by providing enough of a gap between iterations. In this way we reduce the chance of the bit patterns overlapping and confusing the receiver; it also becomes easier to study the behaviour of different cameras when enough of a gap is provided in the test bench. In case the user does not want to place any delay between the iterations, they can pass a negative integer value for this argument.
The function call below will iterate over 25 permutations of "01234", from the 10th to the 35th (i.e. all permutations from "02413" to "12430"), with a 400 millisecond delay between each iteration. This process will be repeated twice before the function returns.
vlctx.Auto_Start(10, 35, "01234", 400, 2);
It is also possible to generate only a portion of all possible permutations. In the example below, the final quarter of all 120 possible permutations of "01234" will be generated.
vlctx.Auto_Start("01234", 400, 3, 4);
This method does not support built-in repetition at the moment, and the user should be careful about the selection of portions. For example, the following statements do not cover all possible permutations in the intended scope, as 120 is not divisible by 7.
for (int i = 0; i < 7; i++)
  vlctx.Auto_Start("01234", 400, i, 7);
A.2 Interfacing The Serial Connection
The second layer of the transmission API makes it possible to develop applications that communicate with the Arduino board through a serial connection. This, however, requires the Arduino board to be programmed accordingly beforehand.
To begin, it is necessary to include the header file of the API.
#include "srlintrfc.h"
After that, a number of objects can be created using the SerialInterface class. This
class takes care of serial port initialization and communication.
vlc::SerialInterface srlInt;
This class provides four overloaded constructors that can initialize the serial connection. The developer is free to specify the baud rate, the serial port, or both. If no arguments are given to the constructor, the default values for the baud rate and serial port are 9600 and "/dev/ttyACM0" respectively.
srlInt.SerialInterface("/dev/ttyACM0", 9600);
The following method can be used to send data over the serial connection. The input can be a standard string or a pointer to a character (the beginning of a C string). The string to be transmitted can be a set of digits that indicate the frequency indices forming the bit pattern for the Arduino board; for example "23145" means generation of F2, F3, F1, F4 and F5 in that order. This method returns a negative integer value on error, or the total number of sent characters on success.
srlInt.serialport_write("23145");
This API also prototypes a simple lookup table for character conversion. This means that if the user wishes to send the word "Hello!", the built-in functions of the API can convert every character of the string into the proper bit patterns that can be transmitted by the Arduino board. The same lookup table is implemented on the receiver side to detect and translate the bit patterns back into characters. An example of this lookup table is presented in Section A.3. To use this functionality the following method can be used. It returns a negative integer value on error, or the total number of sent characters on success.
srlInt.serialport_directwrite("Hello VLC!");
Finally, the following method can be used to read from the serial connection. However, this requires the Arduino board to be programmed in a way that sends data (acknowledgements) back. This method is used mostly for debugging purposes. The first argument is a pointer to a pre-defined buffer, which is used to store the values read from the serial port. The second argument is a character that indicates the end of the message; the Arduino board should send this "until" character at the end of each acknowledgement. The return value of this method is the total number of read characters.
srlInt.serialport_read_until(buffer, until);
A.3 Character Lookup Table
Table A.1 shows an example of how the presence of each frequency can be interpreted as a character in the upper layers of communication.
The binary value for each frequency in Table A.1 indicates its presence in the current video frame. If no frequencies are detected, the frame is considered empty and no character is assigned to it. The defined set column in Table A.1 refers to the set of frequencies that will be requested from the microcontroller to be transmitted. For example, "00000" means F0 should be transmitted five times sequentially (filling exactly one video frame). Similarly, "00012" means F0 will be transmitted three times, followed by an F1 and an F2. The reason that F0 is repeated more than the other frequencies in this case is to increase the SNR for the weaker (i.e. higher) frequencies.
Note that the sequence in which the frequencies appear does not matter in this transmission, meaning that "00011" and "01010" are considered to carry the same logic. This is because once a video frame is converted to its Fourier form with the DFT function, we lose the spatial information of the video frame; there is no information about which frequency appeared first and which one appeared last. The solution to this problem is to narrow the DFT window (explained in Chapter 6.2) down to the lifetime of a frequency (explained in Chapter 4.2.4) and then move the DFT window row by row, followed by a Hanning window. This method yields a higher bitrate at the cost of computational complexity.
Frequencies (F4 F3 F2 F1 F0)   Defined Set   Assigned Character
0 0 0 0 0                      none          RESERVED
0 0 0 0 1                      "00000"       'b'
0 0 1 0 0                      "22222"       'd'
0 0 0 1 1                      "00011"       'g'
1 0 0 0 1                      "00044"       'j'
0 1 1 0 0                      "22233"       'n'
0 0 1 1 1                      "00012"       'q'
1 1 1 0 1                      "00234"       '2'
1 1 1 1 1                      "01234"       'a'

Table A.1: Character lookup table.