Volunteer assignment:

Typist of MIT's calculus course


The goal of this project is to create PEOI's calculus course with complete text, assignments, and review and test questions, so that students can study, take exams and, if they pass, receive PEOI's course completion certificate. In other words, MIT's calculus course is currently similar to a book in a library: the text is there, but there is no means of assessing student knowledge. PEOI wants to change this by importing MIT's calculus course text, then adding PEOI's assortment of learning tools and knowledge assessment.

The first step in creating PEOI's courses is to have all the text in HTML course pages (from which test questions will be extracted). The problem with this first step for MIT's calculus course is that the text is in PDF files that can't be easily converted to HTML. If you have the means to accomplish this conversion, it would naturally save a lot of time and effort.
One method that has been suggested is to use InftyReader, found at http://www.sciaccess.net/en/InftyReader/index.html . However, using it also requires downloading FineReader OCR. The license for the two programs costs one thousand dollars, which PEOI certainly can't pay. As far as I understand, the 15-day free trial of InftyReader does not include FineReader OCR. If someone would like to test the free trial version, that would be appreciated. One volunteer has successfully used Foxit Phantom. I suppose there are other commercial choices, but they are not affordable for PEOI. If you have a better alternative, do let us know.

Barring such PDF to HTML conversion, importing MIT's calculus course requires typing the text by hand or correcting the text extracted from the PDF files, either with optical character recognition or with the "Save as text" option in Acrobat Reader. The text saved by Acrobat Reader 1) loses all layout features originally present in the PDF file; 2) does not retain special characters such as the Greek letters often used in mathematics; and 3) merges portions of text from separate columns that appear on the same line of a page.

Some portions of the PDF text can be converted using optical character recognition (OCR) software (some OCR software is available free of charge on the internet). OCR cannot be used on the portions of MIT's calculus text that are laid out in two side-by-side columns, because OCR mixes the two columns. Moreover, MIT's calculus pages contain graphs and pictures which OCR can't deal with, and these will need to be extracted separately. Finally, as you can expect, calculus has a lot of equations, formulas and special characters which OCR can't handle either.

Thus typing is necessary, either of the entire page or in combination with text extracted by Acrobat Reader or OCR. Today's standard for writing mathematical or scientific equations is MathML, and this can be conveniently done with MathCast, a free, downloadable, open source program. Placing equations into images is not acceptable because there is no means of specifying what operations are present in the equation. Once again, typing is unavoidable, and learning how to write equations in MathCast is necessary for this project.

The text of MIT's calculus course can be viewed at MIT's OpenCourseWare, or more conveniently in PEOI's "Upload document" procedure, where you will find a copy of each chapter's PDF file and the text extracted in txt format from the original PDF file with Acrobat Reader. To access the "Upload document" procedure, you must be registered at PEOI and logged in. Like all procedures, "Upload document" is found in the Procedure pull-down menu at the top of the screen after you log in. Once in the procedure, select course MS221EN and the chapter you want. Toward the bottom of the screen, a "Work status schedule" shows the work done in each of the chapter's sections, and you can choose a chapter where sections have not been completed. When a chapter opens, a link in the middle of the screen gives access to MIT's PDF text, which then opens in a new window. For instance, the original PDF file for Chapter 6 of MS221EN is MITRES_18_001_strang_6.pdf, on which you can right-click and use "Save link as ... " to download it to your computer.

As indicated above, each chapter of MIT's calculus course is divided into sections. Each typed text you submit must contain one entire section of a chapter. A chapter section can contain anywhere from one to ten pages of the original text. Your typed text can be written in any word processor, but the equations must be written in MathML, which is further explained below. A chapter section must then be saved with a name such as "ch1-a" for the first section of chapter 1, in either a .txt or .rtf file (doc and other formats will not work because of the overhead they contain), or better still in an html file if you can write the few paragraph, line break and other minor tags, or if you use Open Office Writer, which generates rather lean html pages. If you do write in html, make sure that the text and the tags are on separate lines. Once fully typed, each section must be uploaded with the "Upload document" procedure. Subsequently, the uploaded section file is inserted into PEOI's HTML course page for the corresponding section of MS221EN using PEOI's "Import rtf" if your file is rtf, or "Import html" if your file is html. As part of the process of importing your text into PEOI's course page, the equations are extracted and saved in JavaScript, and information about figures and images is entered in PEOI's image data bank.
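The naming rule above can be checked before uploading. The following is only a minimal sketch: the function names are hypothetical, and it assumes that later sections continue the pattern of the "ch1-a" example with the next letters ("ch1-b", "ch1-c", and so on), which the text does not state explicitly.

```python
import os
import re

# Extensions the text accepts: .txt, .rtf, or an html file.
ALLOWED_EXTENSIONS = {".txt", ".rtf", ".html"}

# Assumed pattern: "ch" + chapter number + "-" + one lowercase section letter.
SECTION_NAME = re.compile(r"^ch\d+-[a-z]$")


def section_name(chapter: int, section: int) -> str:
    """Build a name such as "ch1-a" from a chapter and a 1-based section number."""
    if chapter < 1 or not 1 <= section <= 26:
        raise ValueError("chapter must be >= 1 and section between 1 and 26")
    return f"ch{chapter}-{chr(ord('a') + section - 1)}"


def is_acceptable_file(filename: str) -> bool:
    """Return True if the file name follows the assumed pattern and extension list."""
    stem, extension = os.path.splitext(filename)
    return bool(SECTION_NAME.match(stem)) and extension.lower() in ALLOWED_EXTENSIONS


print(section_name(1, 1))                # ch1-a
print(is_acceptable_file("ch1-a.doc"))   # False
```

A check like this only catches naming slips; it says nothing about whether the content of the file will import cleanly.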

Once your typed section has been uploaded, it will appear in the "Work status schedule" as done. The typed section must contain an HTML tag for each of the graphs or pictures, such as <IMG SRC="figch1-a-1"> for the first figure of the first section of chapter 1. Each graph or picture can be captured from the screen using "Print screen", saved on your computer with a name such as "figch1-a-1", then uploaded using the "Upload image" procedure. As noted above, if your typed section is imported with "Import html", the information about each figure or image is extracted from the tags and stored in PEOI's image data bank, and the "Upload image" procedure will already have all the information about the figures or images you must upload.
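As an illustration of the figure convention, the sketch below builds the tag for a given figure and lists the figure names a typed section refers to, which is also the list of images to upload. The function names are hypothetical, and the numbering of figures beyond the "figch1-a-1" example is assumed to continue with 2, 3, and so on.

```python
import re

# Matches <IMG SRC="..."> in any letter case and captures the figure name.
IMG_TAG = re.compile(r'<IMG\s+SRC="([^"]+)"', re.IGNORECASE)


def figure_name(chapter: int, section_letter: str, figure: int) -> str:
    """Build a name such as "figch1-a-1" (chapter 1, section a, figure 1)."""
    return f"figch{chapter}-{section_letter}-{figure}"


def figure_tag(chapter: int, section_letter: str, figure: int) -> str:
    """Build the HTML tag to type where the figure belongs in the text."""
    return f'<IMG SRC="{figure_name(chapter, section_letter, figure)}">'


def figures_in_section(section_text: str) -> list:
    """List the figure names referenced in a typed section, in order."""
    return IMG_TAG.findall(section_text)


print(figure_tag(1, "a", 1))  # <IMG SRC="figch1-a-1">
```

Comparing the output of `figures_in_section` with the screenshots saved on your computer is a quick way to spot a figure that was tagged but never captured.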

The equations and formulas must be written in MathML code. This is most effectively done in MathCast, which is an open source download and rather intuitive to use. MathCast does not contain explanations on how to enter operators, but some help is available at ftp://www.pereboil.net/AREAS/ELECTRONICA/CFGS/TC1/ST/EDITOR%20MATEMATICA/Interface/Help/The%20Rapid%20Mathline.htm . MathCast saves each equation in an individual xml file. The useful portion of this xml code must be inserted at the location in the text where the corresponding equation or formula belongs. MathCast's file starts with the usual <html>, <head> and <body> tags and ends with </body> and </html>: all of these tags can be discarded. The useful code must start with <math> and end with </math>. In the initial <math> tag, MathCast inserts the attribute xmlns="http://www.w3.org/1998/Math/MathML", which is not necessary and can be deleted. Instead, it would be useful (but is not essential) for the <math> tag to indicate which equation or formula the code pertains to, if such an indication appears in the text. For instance, the second equation in the first section of chapter 14 can be identified in the <math> tag by id="ch-14-a-2".
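The clean-up described above can be done by hand in a text editor, but it can also be sketched in a few lines of Python. This is an illustration only, not a PEOI tool: the function name is hypothetical, and it assumes the saved file contains a single unprefixed <math> ... </math> block that does not already carry an id attribute.

```python
import re
from typing import Optional

# The first <math ...> ... </math> block in the saved file.
MATH_BLOCK = re.compile(r"<math\b[^>]*>.*?</math>", re.DOTALL | re.IGNORECASE)

# The namespace attribute that the text says can be deleted.
XMLNS_ATTR = re.compile(r'\s+xmlns="http://www\.w3\.org/1998/Math/MathML"')


def clean_mathcast_export(xml_text: str, equation_id: Optional[str] = None) -> str:
    """Keep only the <math> block, drop the xmlns attribute, optionally add an id."""
    match = MATH_BLOCK.search(xml_text)
    if match is None:
        raise ValueError("no <math> ... </math> block found")
    math = XMLNS_ATTR.sub("", match.group(0), count=1)
    if equation_id is not None:
        # Insert the id right after the opening "<math".
        math = re.sub(
            r"<math\b",
            lambda m: f'{m.group(0)} id="{equation_id}"',
            math,
            count=1,
            flags=re.IGNORECASE,
        )
    return math


saved = (
    "<html><head></head><body>"
    '<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi></math>'
    "</body></html>"
)
print(clean_mathcast_export(saved, "ch-14-a-2"))
```

The result is the text to paste into the typed section at the place where the equation belongs.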

Using "Import html" to create MathML scripts

The easiest and recommended method to place MathML code into scripts is to use "Import html". First, type the text of the section, or extract it from Strang's source PDF, into the course page file. Second, create the MathML code for each of the equations using MathCast, save each MathML file, open each MathML file in Wordpad or Notepad to remove any overhead, and paste each into the course page you are typing. Make sure to remove the leading and trailing HTML, HEAD and BODY tags in the MathML code (otherwise the course page you import will be truncated).
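Because a leftover wrapper tag truncates the imported page, it can help to scan the course page before uploading. The sketch below is a hypothetical helper that reports every HTML, HEAD or BODY tag with its line number. It flags all such tags, so if your page legitimately keeps an outer wrapper of its own, those hits are expected and only the ones next to pasted equations need attention.

```python
import re
from typing import List, Tuple

# Opening or closing html, head and body tags, in any letter case.
WRAPPER_TAG = re.compile(r"</?(?:html|head|body)\b[^>]*>", re.IGNORECASE)


def find_wrapper_tags(page_text: str) -> List[Tuple[int, str]]:
    """Return (line number, tag) for every wrapper tag found in the page."""
    hits = []
    for number, line in enumerate(page_text.splitlines(), start=1):
        for match in WRAPPER_TAG.finditer(line):
            hits.append((number, match.group(0)))
    return hits


page = "<P>\nSome text\n</P>\n<html><body>\n<math><mi>x</mi></math>\n</body></html>"
for number, tag in find_wrapper_tags(page):
    print(f"line {number}: {tag}")
```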

Third, once your course page is ready with any overhead removed, upload it using the "Upload document" procedure. In "Upload document", open the MS221EN chapter and section for which you prepared your course page, click on "Enter and upload new reading", and make sure to give the course page you are uploading a recognizable name.

Proceed to the "Import html" procedure, which opens automatically on the same MS221EN chapter and section, and observe that your newly uploaded course page file is among the names of files that you can "Import from". Select that name and click on "Import from". When the screen comes back, the procedure has moved each of the MathML code sections into scripts, removed that code from the page, and replaced it with a script tag preceded by an anchor tag. The script tag is of the format

<A NAME="anchorch2bn256ml"></A> <SCRIPT SRC="http://www.peoi.net/Courses/Coursesen/calculusmit/script/ch2bmleq2299.js">

If all the scripts appear correctly in the pink "Preview" box you can click on "Save". If the equations are not the same as in the original text, it is necessary to go back to your course page text, verify the MathMl code, make corrections, and upload the course page file again.
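Before comparing equations one by one, a rough count can show whether any equation was lost during import. The sketch below is an assumption-laden illustration with hypothetical function names: it supposes you have the text of your uploaded file and the text of the imported page at hand, and that every equation script appears as a SCRIPT tag with a SRC attribute, as in the single example shown above.

```python
import re

# Opening <math> tags in the file you typed and uploaded.
MATH_OPEN = re.compile(r"<math\b", re.IGNORECASE)

# <SCRIPT SRC="..."> tags in the imported page; whitespace after "<" is tolerated.
SCRIPT_SRC = re.compile(r'<\s*SCRIPT\s+SRC="([^"]+)"', re.IGNORECASE)


def count_math_blocks(uploaded_text: str) -> int:
    """Count the equations present in the uploaded course page file."""
    return len(MATH_OPEN.findall(uploaded_text))


def script_sources(imported_text: str) -> list:
    """List the script files referenced by the imported page, in order."""
    return SCRIPT_SRC.findall(imported_text)


def same_equation_count(uploaded_text: str, imported_text: str) -> bool:
    """True when the imported page has one script per uploaded equation."""
    return count_math_blocks(uploaded_text) == len(script_sources(imported_text))
```

A matching count does not prove that the equations display correctly, so the visual check against the original text remains necessary.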

It is possible to paste your course page file prepared with HTML into the blue TEXTAREA instead of uploading the file. But this is not recommended, because keeping a trace of the uploaded file can be useful if corrections are needed.

A less desirable method is to create MathML scripts directly with the "Edit MathMl" procedure, and place the corresponding script tag in the course page. Do not attempt to place anchor tags: they must be inserted and maintained automatically by PEOI's procedures.

Proofreading Calculus course

To proofread the calculus course, look at the files that have been uploaded for the MS221EN chapter assigned to you to determine whether they are rtf or html. If the files uploaded for the chapter are rtf, select the "Import rtf" procedure and open the section you want to proofread. If the files are html, select "Import html". Separately, you should have a copy of the chapter either printed on paper or open in its own window. It is also recommended to follow the text in a copy of the typed uploaded file, which you can download in "Upload document" by right-clicking on the blue "docname" link.

You should check the text that appears in the pink "Preview box" in either "Import rtf" or "Import html" against the typed uploaded document and the original PDF file. In some cases, corrections can be made directly in the white TEXTAREA, and the section saved by clicking on "Save". But it is likely to be more productive to note all the corrections on a paper copy of the typed document, then decide whether to make the corrections in the typed page or in the TEXTAREA. Making corrections in the typed page using Notepad, Wordpad or some other text-only editor is recommended if there are more than just one or two corrections. The corrected typed page will then have to be uploaded again, as well as imported.

In some cases, the pink "Preview box" can be difficult to read, in part because of the anchor tags that appear in it. The page can be viewed exactly as students will see it by clicking on the blue link to the temporary file appearing just above the pink Preview box.

More information on how to do this assignment appears in the comments of volunteers who are working on this project in "My discussion", which you are urged to read after logging in.

All of PEOI's tasks can be carried out from your home, on your own computer and on your own time. New volunteers are invited to register as volunteer staff members. For more details, please write a short message indicating the volunteer opportunity you are interested in, attach a recent resume, and email it to the appropriate person listed in the contact information.

To ask questions, or to offer your comments, criticism or suggestions, please write to peoi@peoi.org .