Offline Wikipedia reader

From Openmoko


Revision as of 04:05, 12 January 2009

Offline Wikipedia is one of the applications that run on Openmoko phones. For a list of all applications, visit Applications.

The Offline Wikipedia reader is a set of scripts and programs that display Wikipedia pages without an internet connection.

The software works by running a lightweight web server on the phone and using PHP to render the pages, which can then be viewed in any web browser.
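
The serving model described above can be illustrated with a minimal sketch. It is written in Python rather than PHP and is not the project's code: the `ARTICLES` dictionary, the `render_page` and `serve` functions, and port 8000 are all hypothetical stand-ins for the real index, database and web server.

```python
import html
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import unquote

# Hypothetical in-memory store standing in for the real index and database.
ARTICLES = {"Openmoko": "Placeholder article text."}


def render_page(title, articles):
    """Return (HTTP status, HTML body) for the requested article title."""
    text = articles.get(title)
    if text is None:
        return 404, "<html><body><h1>Not found</h1></body></html>"
    body = "<html><body><h1>{}</h1><p>{}</p></body></html>".format(
        html.escape(title), html.escape(text)
    )
    return 200, body


class ArticleHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Treat the URL path as the article title, e.g. /Openmoko
        title = unquote(self.path.lstrip("/"))
        status, body = render_page(title, ARTICLES)
        data = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port=8000):
    """Serve articles on localhost only, so any browser on the device can read them."""
    HTTPServer(("127.0.0.1", port), ArticleHandler).serve_forever()
```

Calling `serve()` and opening `http://127.0.0.1:8000/Openmoko` in a browser on the same device would display the placeholder article. Binding to 127.0.0.1 keeps the server reachable only from the device itself.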

Instructions can be found here: http://users.softlab.ece.ntua.gr/~ttsiod/buildWikipediaOffline.html

Before any pages can be displayed, an indexer must be run. Run it on a desktop or laptop, because the FreeRunner is not powerful enough; indexing takes approximately one hour on a laptop with a dual-core 1.1GHz CPU and 1.5GB of RAM.

The dependencies (PHP, Perl, Python) are then installed on the FreeRunner, and the index and database are copied to it along with the web server.
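
A quick way to confirm that the three dependencies are present after installation is sketched below. The executable names `php`, `perl` and `python` are assumptions; the actual names depend on how the packages are built for the device, and `missing_tools` is a hypothetical helper.

```python
import shutil

# Assumed executable names; adjust to match the installed packages.
REQUIRED = ("php", "perl", "python")


def missing_tools(names, which=shutil.which):
    """Return the names from `names` that cannot be found on the PATH."""
    return [name for name in names if which(name) is None]
```

`missing_tools(REQUIRED)` returns an empty list when all three interpreters are found.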

All Wikipedia pages come from the Wikipedia page dump at http://download.wikimedia.org/enwiki/20081008/. The file needed is called 'pages-articles.xml.bz2'; the most recent one is 4.1GB.
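
To give a feel for what an indexing pass over the dump involves, the sketch below streams the compressed XML and records each page title without unpacking the whole file. It is not the indexer used by this project, whose method is documented on the instructions page; `build_title_index` and `local_name` are hypothetical names, and the index built here (title to ordinal position) is deliberately simplistic.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://...}title' -> 'title'."""
    return tag.rsplit("}", 1)[-1]


def build_title_index(dump_stream):
    """Map each page title to its ordinal position in the dump.

    `dump_stream` is a binary file-like object yielding decompressed XML.
    """
    index = {}
    position = 0
    for _event, elem in ET.iterparse(dump_stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            index[elem.text or ""] = position
            position += 1
        elif name == "page":
            # Free the memory held by a finished page; the dump is far
            # larger than the RAM of a typical machine.
            elem.clear()
    return index


def index_dump(path):
    """Open a .xml.bz2 dump and index it in a streaming fashion."""
    with bz2.open(path, "rb") as stream:
        return build_title_index(stream)
```

`index_dump("pages-articles.xml.bz2")` would walk the whole dump. A real index would also need to record where each article's text can be found, which this sketch does not attempt.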

The English Wikipedia (text only) is around 6GB including indices, so the entire content fits on one 8GB card. The German Wikipedia is approximately a quarter of that size, so it fits on a correspondingly smaller card; the same applies to other languages.
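
The sizing above can be worked through in a few lines. The 6GB figure and the one-quarter ratio come from the text; the list of card sizes and the `smallest_card` helper are assumptions for illustration, and nominal capacities are used, so the usable space on a real card will be somewhat lower.

```python
ENGLISH_GB = 6.0             # text only, including indices (from the text above)
GERMAN_GB = ENGLISH_GB / 4   # approximately a quarter of the English size

# Assumed nominal card capacities in GB.
CARD_SIZES_GB = (1, 2, 4, 8, 16)


def smallest_card(content_gb, sizes=CARD_SIZES_GB):
    """Return the smallest nominal card size that holds `content_gb`, or None."""
    for size in sorted(sizes):
        if content_gb <= size:
            return size
    return None
```

Under these assumptions `smallest_card(ENGLISH_GB)` gives 8 and `smallest_card(GERMAN_GB)` gives 2, since a quarter of 6GB is 1.5GB.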


Development status

At present, a single tar.bz is downloaded from the site above, the page dump is downloaded and copied to the correct location, and the indexing process is run. The dependencies (PHP, Perl, Python) are then installed on the FreeRunner, and the index and database are copied to it along with the web server.

In the future the software will be released as an ipk. A deb/rpm for desktop distributions is also planned, to allow automatic download of the most recent Wikipedia dump, with a diff utility so that it can be updated and re-indexed.


Offline Wikipedia reader

Read entirety of Wikipedia offline


Homepage: http://users.softlab.ece.ntua.gr/~ttsiod/buildWikipediaOffline.html
Package: no package yet
Tested on: -
