GHNSiteArchitectureDocumentation


Global Heritage Network Site Architecture Documentation

Overview

This document provides a detailed understanding of the code structure of the GHN site. Credentials to log in, back up, and modify the code will be provided by GHF. For further questions on any of these topics please contact [email protected].

A. Environment

The GHN site is built on the LAMP stack. The version of the Linux environment can be obtained from the hosting account. We’re using Apache 2.1.2, with mod_rewrite turned on to structure some of the KML URLs. MySQL is version 5.1, and the hosting company backs up the database daily. PHP is version 5. For details about the PHP setup, please check: http://ghn.globalheritagefund.org/sys.php

The php.ini file, which contains all the PHP environment settings, is located at the web root (/php.ini).

B. Database

The database can be accessed through the hosting control panel. All the tables for the GHN website are stored under the globamu1_ghn schema. Database connection parameters can be found under the /config folder. The following are the tables that are used for the codebase (dated 12-31-2011):

Table name              Description
countries               Contains the standardized list of all countries. Developing countries are marked by the is_developing flag.
library_category        List of all categories for the document library. Categories are nested using the parent_id field.
library_document        List of all documents in the library, including the links. Some documents are mapped to site_ids.
news                    List of URLs for all the news channels (RSS feeds).
saved_search_ids        Used for mapping user/keyword to search results table. Explained in further detail in the Search section.
saved_search_results_1  A snapshot of a search result, for one user and one keyword. Results are cached in tables following the “saved_search_results_#” pattern. Explained in further detail in the Search section.
saved_search_template   This is the template table that the above tables are created from. Explained in further detail in the Search section.
searchables             This is the search index. All items are indexed as class/object couples. Explained in further detail in the Search section.
searchable_pages        These are external pages for the search index. Typically GHF pages are indexed here. The data is managed through the admin console.
search_log              Here, we keep a list of every search that has ever taken place, so that we can analyze what people are searching for.
search_titles           This table stores the site names for the autocomplete features. It is short, indexed, and very fast.
sites                   All the meta information about a site is stored here.
site_admins             Lists all site coordinators, and what sites they are managing.
site_kmls               List of KMLs for sites.
site_links              List of links for sites.
site_photos             List of photos for sites.
site_videos             List of videos for sites.
userProfiles            Details about the user’s profile are stored here. The info about the user is queried from the Ning login page (source code scrape).
users                   Users are stored here.
user_activity           Shows logins, profile updates, etc.
wiki_cache              Used for the wiki cron job, for versioning of the Wiki content.
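The parent_id nesting used by library_category can be illustrated with a minimal sketch. It is written in Python rather than the site's PHP, and it is not the site's actual code: the id and name columns, the sample rows, and the sub-category names are assumptions made for illustration; only the parent_id field comes from the table description above.

```python
from collections import defaultdict

# Hypothetical rows: (id, parent_id, name). parent_id is None for top-level categories.
ROWS = [
    (1, None, "Planning"),
    (2, None, "Conservation"),
    (3, 1, "Master Plans"),   # hypothetical sub-category
    (4, 1, "Site Surveys"),   # hypothetical sub-category
    (5, 3, "Drafts"),         # hypothetical sub-sub-category
]

def build_tree(rows):
    """Group categories by parent_id and nest them recursively."""
    children = defaultdict(list)
    for cat_id, parent_id, name in rows:
        children[parent_id].append((cat_id, name))

    def subtree(parent_id):
        return [
            {"id": cat_id, "name": name, "children": subtree(cat_id)}
            for cat_id, name in children.get(parent_id, [])
        ]

    return subtree(None)

def flatten(tree, depth=0):
    """Yield (depth, name) pairs in display order."""
    for node in tree:
        yield depth, node["name"]
        yield from flatten(node["children"], depth + 1)

tree = build_tree(ROWS)
outline = list(flatten(tree))
```

Grouping the rows by parent once and then recursing keeps the work to a single pass over the table, which suits a category list that is small and read often.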

C. Front-end

The front-end of the application is defined as everything that a non-logged-in (guest) user can see and interact with. The main interface is the “explore” page. This page is designed to resize the Google Earth canvas for maximum viewing area (horizontally and vertically).

Every page on the site includes a script that initializes common variables, such as threat types or countries. This script is located at /includes/page_start.php.

i. YouTube video: The homepage starts with a video intro, which is loaded from YouTube using the YouTube APIs. At the end of the video, JavaScript code is triggered to set a cookie for the user, so that they don't see the video again until their next visit. The user can also click on the “skip” button to trigger the same functionality. The video ID is configured in the loadIntro() function in the explore_v3.php code.

ii. Google Maps vs. Google Earth: The interactive map on the homepage is provided by Google. If the user has the Google Earth plug-in on their system, it is loaded. If not, Google Maps is loaded instead, and a friendly notice asks the user to install the Google Earth plug-in, since Google Maps cannot provide the same level of interactivity that Google Earth provides.

iii. Top navigation: The top navigation is frequently updated to meet changing user and business requirements, so the following list may be outdated. Please contact [email protected] for questions about the current state of the navigation items. The items in the top navigation are configured in the /includes/header.php file. Each navigation dropdown is loaded from /includes/submenu_#.php. Here is a snapshot of the links and their functionality:

1. Global Heritage Sites (generic): This link brings down the site-search interface, where the user can browse/search all heritage sites in the database. The search results are displayed in the lower section of the layer, and when the user clicks on a site name, the selected site loads without reloading the page. When a site is loaded, the floating panel (see section iv. Floating Panel) is refreshed with the information about the new site. The floating layer is an iframe that is initialized with parameters such as the site_id and the selected tab.

2. Planning (site-specific): Loads all documents of the category “Planning”. Documents are grouped by sub-categories, which are managed in the admin console.

3. Conservation (site-specific): Loads all documents of the category “Conservation”. Documents are grouped by sub-categories, which are managed in the admin console.

4. Community (site-specific): Loads all documents of the category “Community”. Documents are grouped by sub-categories, which are managed in the admin console.

5. Partnerships (site-specific): Loads all documents of the category “Partnerships”. Documents are grouped by sub-categories, which are managed in the admin console.

6. Resources (generic and site-specific): Static list of links for additional information, which includes a link to the library, and the Ning community.

7. How You Can Help (generic): Static list of links for user participation.

8. GHN Sponsors (generic): Deep link to the sponsors page on the GHF site.

9. Visit GHF (generic): Link back to GHF site.

iv. Floating Panel: The floating panel shows specific information for the selected site. By default “Banteay Chhmar, Cambodia” is displayed. The floating panel can be dragged anywhere on the screen and can be collapsed into a short bar to enable a larger map viewing area. The code for this panel is located at /iframe_float.php. Some of the functional areas of the panel are:

1. Top section: Shows the title of the site, the country name, and the tag line. The threat level is represented by a colored dot. Below that area, threats are listed with the proper icons. Site images rotate at thumbnail size. Links to photos and videos are positioned under the thumbnails. All info here comes from the sites table, which contains the meta information about sites (see the B. Database section for details).

2. Overview tab: The Overview tab loads the Wikipedia description for the site. A link to the Wikipedia page is also provided for further reading.

3. Map Layers tab: All the site KMLs are loaded in this box and grouped by KML category. In the admin section, individual KML files can be marked as “load by default”. The user can turn on/off each KML file. When a KML file is selected, the Google Earth plugin will zoom/pan based on the information embedded in the KML file.

4. Bottom section: Here, you will see links to the Ning community for the selected site, as well as a link to contact the site coordinators. At the very end of the screen, depending on your login and access level, you will see two buttons: Edit and Assign. The Edit button brings up the admin page for the site, where you can edit everything you see on the explore page. The Assign button opens an overlay for the administrator to assign site coordinators to the selected site.

v. Site Resources: When the user clicks on Resources > All Site Resources from the top menu, a new page pops up, displaying all documents for the selected site. You can map documents to a site using the administration panel. The code for this page is located at /resources_v2.php.

vi. Document Library: The library opens with the “featured documents”. These documents are marked as “featured” through the site administration. Results are paginated on the server side, based on the results-per-page selection above the results. Once the data is loaded, it is sorted on the front-end using the jQuery sortable table. All entries in the library are stored in the library_documents table in the database, and can be managed from the admin console by site administrators. The library supports both external URLs and file uploads. URLs are only stored in the database, whereas uploaded files are physically copied to the /uploads/library/ folder and named document_DOC_ID.EXT. The code is stored in the /library_v2.php file.

Documents are tagged with nested categories. The categories are managed through the admin console. Entries for the categories are stored in the library_categories table.
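The server-side pagination and the document_DOC_ID.EXT naming described above can be sketched as follows. This is an illustrative Python sketch, not the code in /library_v2.php: both function names are hypothetical, and lowercasing the extension is an assumption the source does not confirm.

```python
import os

def page_bounds(page, per_page):
    """Return (offset, limit) for a 1-based page number, as used in a LIMIT clause."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    return (page - 1) * per_page, per_page

def stored_document_name(doc_id, original_filename):
    """Build the document_DOC_ID.EXT name for a file copied to /uploads/library/."""
    # Assumption: the extension is taken from the uploaded file name and lowercased.
    ext = os.path.splitext(original_filename)[1].lstrip(".").lower()
    if not ext:
        raise ValueError("uploaded file has no extension")
    return f"document_{doc_id}.{ext}"
```

For example, page 3 at 25 results per page corresponds to an offset of 50, and an upload named "Master Plan.PDF" for document 42 would be stored as document_42.pdf under these assumptions.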

D. Back-end

This section requires login. We are tightly integrated with the Ning login page, and verify login using their page HTML. If Ning makes major changes to the structure (HTML) of their pages, we may need to update our code to scrape the new identifiers.

All admin interfaces hit the /myaccount.php code, which then includes the proper file from the /includes/ folder, depending on the selected view and the user's access level. Every page on the site includes a script that initializes common variables, such as threat types or countries. This script is located at /includes/page_start.php.

Upon login, site coordinators will see a list of all the sites they manage, and the recent changes to those sites. The code for this interface is stored at /includes/myaccount_my_contributions.php. They can also see their pending requests for site management. When they click on a site name, the edit site dialog will come up. Some of the site information, such as classification, location, and title, cannot be edited by site coordinators. These form fields will be disabled for them. However, they will be able to update everything else, including KMLs, photos, videos, and documents.

Site administrators will be able to manage nearly all dynamic content on the GHN site. The following are the admin sections:

i. Manage Sites


Here the admins can search for sites using all meta fields. This code is located at /includes/myaccount_admin_sites.php. Once a site is located, click on its title to bring up the Edit site dialog (/manage_site.php). Here you will be able to edit all information about the site. The view switches between read and write mode using URL parameters. This page also loads all media associated with the site and makes AJAX calls to delete items. The add and edit functionality brings up a secondary form specific to the media type being edited.

1. Google Earth pointer to site location: In the read view, Google Earth loads with a pointer to the site. In the write mode, you can drag the pointer to the exact location of the site.

2. Meta information: Includes Site type, Country/Region, Title, Subtitle, Description, Wikipedia/GHF URL, Ning URL, Is featured. This information is stored directly in the database (table: sites) without further processing. The threat levels and threat types are specified in /includes/page_start.php, so that they appear consistently across the site.

3. Photos: Adds/edits/removes photos for a particular site. The code is stored at /manage_photo.php and all the actions take place in the /manage_photo_actions.php file. The information is stored in the database (table: site_photos). The selected image file is physically copied to the /uploads/photos/ folder. The images are resized to 160x120 pixels for the thumbnail, and 640x480 pixels for the large view. The file names are stored as site_SITEID_PHOTOID_t.jpg (thumbnail) and site_SITEID_PHOTOID.jpg (large). The photos can also be geotagged using the Google Earth plug-in. Some photos contain location information embedded in the file, and the PHP code tries to capture that. This is not always successful, however, due to inconsistent data storage across different cameras or image processing.

4. Videos: Adds/edits/removes videos for a particular site. The code is stored at /manage_video.php and all the actions take place in the /manage_video_actions.php file. The videos are not physical files, but embed code from internet video services such as YouTube or Google Video. The information is stored in the database (table: site_videos).

5. KMLs: Adds/edits/removes KMLs for a particular site. The code is stored at /manage_kml.php and all the actions take place at /manage_kml_actions.php file. The information is stored on the database (table: site_kmls). The uploaded KML and KMZ files are physically copied to the /uploads/kml/ folder. The file names are stored as site_SITEID_KMLID.EXT. Default KML files are dynamically generated from the location information in the database. The code that generates dynamic KMLs for geotagged objects (sites, photos, videos) is stored at /dynamic_kml.php.

6. Documents: Adds/edits/removes documents for a particular site. The details of the functionality are covered in the library section (see vi. Document Library).
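As an illustration of the dynamically generated default KML files mentioned under item 5 (KMLs), the following minimal sketch builds a KML document with a single placemark from a name and a location. It is written in Python and is not the code in /dynamic_kml.php: the function name is hypothetical, the real output may contain more elements (icons, styles, descriptions), and the coordinates in the usage line are approximate and for illustration only.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

def placemark_kml(name, latitude, longitude):
    """Return a KML document (as text) with one Placemark at the given location."""
    ET.register_namespace("", KML_NS)  # serialize with a default xmlns, no prefix
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    # KML orders coordinates as longitude,latitude[,altitude].
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{longitude},{latitude},0"
    return ET.tostring(kml, encoding="unicode")

# Approximate coordinates, for illustration only.
kml_text = placemark_kml("Banteay Chhmar, Cambodia", 14.07, 103.1)
```

Building the document with an XML library rather than string concatenation means site names containing characters such as & or < are escaped automatically.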

ii. Manage Users

Here the admins can search for users that have logged in to GHN using their Ning logins. Since we disabled the intrinsic login functionality on GHN, this section is now for looking up users, instead of editing/removing users. This code is located at /includes/myaccount_admin_users.php.

iii. Manage News Feeds

This section contains a sortable list of all news feed channels that are displayed on the front-end (http://ghn.globalheritagefund.org/news.php). To add a new channel, simply copy/paste the URL from the RSS source into the form (code: /add_news_feed.php). The code for the form is located at /includes/myaccount_admin_news.php. The data is cached using the SimplePie PHP RSS Reader library (http://simplepie.org/).

iv. Manage Site Coordinators

This section is used to map existing users to sites as site coordinators. The mappings are stored in the site_admins table. The code is stored at /includes/myaccount_admin_site_admins.php. In order for a user to appear in this list, he/she needs to be marked as a “site coordinator”. To add a user to this list, go back to the Manage Users section, locate the user, click on edit user, and change the user type to “site coordinator”.

v. Manage Document Categories

Here, you will be able to manage the library categories and their sub-categories. The categories are stored in the library_categories table, and the code can be found at /includes/myaccount_admin_doc_categories.php. The organization and drag & drop functionality is provided by jQuery.

vi. Manage Searchable Pages

External pages that are included in the search results are managed here. Simply add a new URL to the list, and the content of the page will be crawled and refreshed daily. The list is stored in the searchable_pages table, and the interface is stored in /includes/myaccount_admin_searchable_pages.php. More details are in the search section (see E. Search).

vii. KML Test Area

This functionality is provided to test KMLs before they are deployed on the site. The code that displays the uploaded KML is stored at /kml_demo.php. The uploaded KML is saved as /uploads/kml/FILE_NAME.EXT. The system keeps only one file at a time.

viii. Manage Documents

Documents are managed inline in the library. When the user is logged in, add/edit/delete buttons are displayed on the library page. Please refer to the library section for technical details (see vi. Document Library).
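To illustrate the /myaccount.php dispatch described at the start of section D (one entry point that includes a file from /includes/ depending on the selected view and the access level), here is a minimal sketch in Python. It is not the site's PHP code: the view names, the access level names, and the required level for each view are assumptions; only the include file names are taken from this document.

```python
# Access levels in increasing order of privilege (names are illustrative).
LEVELS = {"guest": 0, "site_coordinator": 1, "administrator": 2}

# view name -> (include file, minimum access level).
# View names and required levels are hypothetical; file names come from section D.
VIEWS = {
    "my_contributions": ("myaccount_my_contributions.php", "site_coordinator"),
    "sites": ("myaccount_admin_sites.php", "administrator"),
    "users": ("myaccount_admin_users.php", "administrator"),
    "news": ("myaccount_admin_news.php", "administrator"),
    "site_admins": ("myaccount_admin_site_admins.php", "administrator"),
    "doc_categories": ("myaccount_admin_doc_categories.php", "administrator"),
    "searchable_pages": ("myaccount_admin_searchable_pages.php", "administrator"),
}

def resolve_include(view, user_level):
    """Return the include path for a view, or None if unknown or not permitted."""
    entry = VIEWS.get(view)
    if entry is None:
        return None  # unknown view: never build a path from raw request input
    filename, required = entry
    if LEVELS.get(user_level, 0) < LEVELS[required]:
        return None  # user's access level is too low for this view
    return "/includes/" + filename
```

Looking the view up in a fixed table, rather than concatenating the request parameter into a path, keeps a crafted parameter from pulling in an arbitrary file.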

E. Search

The search on GHN is carried out on the database. The following objects are searchable:

● Sites
● Photos
● Videos
● KMLs
● Documents
● Web pages

Every time one of the objects above is created, updated, or deleted, the search index is updated. The PHP code is located at /search_v2.php, which in turn includes the /includes/search_*.php files for displaying different types of objects. All information is stored in the searchables table. For example, a photo is indexed as

● class="photo"
● object_id=site_photos->oid
● search_text=cleaned up text (title, description, meta data)

* The search_text field is indexed using the MySQL FULLTEXT index feature.

Search results are cached for each search term/session combination, which means that the user will always get the same results for the same keywords during one session. When a user searches for a term, the code does a quick lookup on the search_terms table to see if this user has done this search before. If so, it grabs the table name for the results. If not, it creates a new entry and also creates a new search results table from the search_results_template template table.

At the end of every day, a cron job runs to clear inactive entries and search results, and records the search terms in the search_log table for analysis. The cron job is stored at /cron_search.php.

F. Ongoing Maintenance (Cron jobs)

The following jobs run on the server every day at 12:00am. The search job manages the search tables, and the wiki job manages the Wikipedia descriptions for the sites.

● /ramdisk/bin/php5 /home/globamu1/www/_subdomains/ghn/cron_search.php
● /ramdisk/bin/php5 /home/globamu1/www/_subdomains/ghn/wikiGrabber.cron.php


G. File structure

Folder Subfolder Description

/ The web root. This folder contains nothing but php files (except for favicon.ico and php.ini)

cache The SimplePie RSS reader caches the remote XML responses here for faster response time. Web access to this folder is restricted through .htaccess.

classes All object oriented PHP classes are stored here. Web access to this folder is restricted through .htaccess.

conf Contains the database connection parameters. Web access to this folder is restricted through .htaccess.

includes Contains all files that are included into PHP files at the root level. Web access to this folder is restricted through .htaccess. That ensures that the include files are never executed without proper initialization and user authentication.

static All the static content is abstracted under this folder, for future architecture using a CDN.

css All CSS files used in GHN are stored here.

feeds Statically generated RSS files for GHF content are stored here. This content will be rendered on the news page.

flash All Flash movies, and their associated files are stored here.

images All images used in the UI are stored here. Photos, and other elements are dynamically generated under the /uploads/ folder.

jquery All jquery foundation and extension files are stored here. It includes jquery theming files (CSS + images + icons) as well.

js All static javascript files are stored here. The generic Google Earth plug-in management code is here, but the dynamically generated javascript functions are in the explore_v3.php code, and its includes.

kml All static KML files, such as tours, are stored here.

pdf All pdf files, including the user guide, are stored here.

templates These are the email and KML templates that are used for generating new or dynamic content.


uploads Uploads folder contains all dynamically generated content.

descriptions The descriptions downloaded from Wikipedia are stored here for quick access.

kml Uploaded KML and KMZ files are stored here.

library Uploaded library documents are stored here.

photos Uploaded photos are stored here.

H. URL Mapping

The following table shows a mapping of the user-friendly URL structure, and the PHP file it redirects to. These settings are stored in the /.htaccess file on the web root.

Friendly URL                Mapped File
/sitemap.xml                /sitemap_generator.php
/rss.xml                    /rss_generator.php
/search.php                 /search_v2.php
/explore.php                /explore_v3.php
/index.php                  /explore_v3.php
/library.php                /library_v2.php
/resources.php              /resources_v2.php
/ajax_site_photos.php       /ajax_site_photos_v3.php
/ajax_site_videos.php       /ajax_site_videos_v3.php
/kml/all.kml                /dynamic_kml.php
/kml/site_([^/]+).kml       /dynamic_kml.php?class=site&oid=$1&icon=true
/kml/pointer_([^/]+).kml    /dynamic_kml.php?class=site&oid=$1&icon=false
/kml/photo_([^/]+).kml      /dynamic_kml.php?class=photo&oid=$1
/kml/video_([^/]+).kml      /dynamic_kml.php?class=video&oid=$1
/js/ghn_login.js            /ghf_login.php
/js/ghn_userinfo.js         /ghf_userinfo.php
404                         /error_404.php
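To make the mapping table easier to reason about, the following Python sketch resolves a friendly URL to its mapped file using the same patterns. It only illustrates how the captured group is substituted for $1; on the site itself the rewriting is done by mod_rewrite rules in /.htaccess, which are not reproduced here. The function name is hypothetical, the literal dots are escaped in the sketch, and unmatched paths return None because the 404 page is assumed to be handled separately by the web server.

```python
import re

# (pattern, target) pairs transcribed from the mapping table; "$1" marks the captured group.
RULES = [
    (r"/sitemap\.xml", "/sitemap_generator.php"),
    (r"/rss\.xml", "/rss_generator.php"),
    (r"/search\.php", "/search_v2.php"),
    (r"/explore\.php", "/explore_v3.php"),
    (r"/index\.php", "/explore_v3.php"),
    (r"/library\.php", "/library_v2.php"),
    (r"/resources\.php", "/resources_v2.php"),
    (r"/ajax_site_photos\.php", "/ajax_site_photos_v3.php"),
    (r"/ajax_site_videos\.php", "/ajax_site_videos_v3.php"),
    (r"/kml/all\.kml", "/dynamic_kml.php"),
    (r"/kml/site_([^/]+)\.kml", "/dynamic_kml.php?class=site&oid=$1&icon=true"),
    (r"/kml/pointer_([^/]+)\.kml", "/dynamic_kml.php?class=site&oid=$1&icon=false"),
    (r"/kml/photo_([^/]+)\.kml", "/dynamic_kml.php?class=photo&oid=$1"),
    (r"/kml/video_([^/]+)\.kml", "/dynamic_kml.php?class=video&oid=$1"),
    (r"/js/ghn_login\.js", "/ghf_login.php"),
    (r"/js/ghn_userinfo\.js", "/ghf_userinfo.php"),
]

def resolve(path):
    """Return the mapped target for a friendly URL, or None if no rule applies."""
    for pattern, target in RULES:
        match = re.fullmatch(pattern, path)
        if match:
            if match.groups():
                # Substitute the captured object id for the $1 placeholder.
                target = target.replace("$1", match.group(1))
            return target
    return None
```

For instance, /kml/site_42.kml resolves to /dynamic_kml.php?class=site&oid=42&icon=true, while a path with no rule, such as /news.php, is left untouched.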

I. Source Code Repository

The source code for this project is maintained on the WUSH.NET cloud services (url: http://wush.net/svn/ghn). Currently, Oguz Olcay and Jesus Jimenez have read/write access, and the web server has read access. The login credentials are not provided in this document, but for additional accounts, please contact [email protected].

J. Questions / Comments

For further details, please contact Oguz Olcay at [email protected] or at (408) 813-0405.

------- END OF DOCUMENT -------