Contribute to Gwulo
Submitted by David on Thu, 2016-05-05 15:57
Your contributions help Gwulo grow. Here are some ideas for ways to contribute:
- Words
What can you tell us about old Hong Kong? Stories, memories, facts & figures are all welcome. You can leave a comment on any page if you can add more information about it, or post a new message to the forum.
- Photos
We all enjoy looking at photos of old Hong Kong. It's easy to upload a digital photo / scanned image to Gwulo. Donations of original photos are also very welcome.
- $1
Help keep the lights on at Gwulo HQ by signing up to make a monthly payment of USD1 or more, or by buying a copy of the Gwulo book or a print of one of Gwulo's photos from the shop.
- Friends
Everyone needs more friends! If you have friends or family who are interested in old Hong Kong, please could you suggest they sign up for the free newsletter?
- 30 minutes
Join in the project to put Hong Kong's Juror Lists online. It only takes 30 minutes to type a page.
- </code>
If you're an expert in PHP or JavaScript, and especially if you know Drupal, we need your brain cells! Please leave a comment below and I'll be in touch.
If you've got any questions, or you'd like to contribute in some other way, please leave a comment below.
Regards, David
Financial contribution
Hi David,
I would like to contribute to the running costs of Gwulo but do not really want to use the patron system. The reason for this is that I will be charged a transaction fee every month by my bank or credit card company.
Can I make an annual contribution and set this up as a direct debit on my credit card? That way I only pay the transaction fee once, which means I can increase my contribution.
re: Financial contribution
Hi Thomas,
Several other Patrons asked if they could make an annual payment instead of monthly. Please see suggestions for how to do this at: https://gwulo.com/comment/35546#comment-35546
Thanks for your support!
Best regards, David
1930 Jurors list
I used OCR software to convert the 1930 Jurors list into a Word document and a searchable PDF file. Conversion into an Excel file was unsuccessful.
You can take a look at the results here. They're not 100% accurate, but they may make importing into Excel easier by C&P.
1930 Jurors list
OCR & Jurors lists
I used OCR (ABBYY FineReader 12) to convert the 1941 Jurors list (see https://gwulo.com/jurors-list-1941).
In the end the time to prepare a page (copy & paste, and check & correct errors) was a little bit faster than the current method we use (correct the details from the previous year's list), but more difficult to use in a group, so I've stuck with the current method.
Enhance JPEG before OCR
Hi David,
I split each page of the 1930 Jurors list into individual JPEG images, removed the lines and dots that confuse most OCR software, and enhanced the black level of the words before converting each image to a TXT file with OCR.
This is what I got: a list of the occupations, names and addresses that is relatively easy to proofread, correct and C&P, as long as each item is kept to one line.
Banker
Jardine, Matheson & Co., Ld.
Per pro., Mackinnon, Mackenzie & Co.
Merchant
Chief Manager, Bank of East Asia; Ld.
Stock Broker, Geo. & H. A. Lamwert
General Manager, Union Ince. Socty. of
Canton, Ld.
Exchange Manager, Bank of Canton, LL
Director, Reiss, Massey & Ld.
Principal, Little, Adams & Wood
Assistant Manager, Butterfield & Swire.
Resident Partner, Mackinnon
Mackenzie
Butterfield & Swire _
Director, Gilman & Co., Ld.
Shanghai Bank
Caldbeck, Macgregor & Co.
Gen. Manager, Standard Oil Co.
Merchant, J. D. Hutchison & Co.
Merchant, Bradley & Co., Ld.
Manager, Bank of China, Ld.,
Merchant
Exchange Broker
Principal, C. A. da Roza
Merchant, W. R. Loxley & Co
Manager, Mercantile Bank of India,Ld,
Incorporated Accountant, Percy Smith,
Seth & Fleming
Butterfield & Swire
Freight Agent, Canadian Pacific S.S., Ld.
Merchant, Shewan, Tomes & Co.
Merchant, Silva-Netto & Co.
Manager, China Underwriters, Ld.
Borneinann & Co.
Shipping Manager, Jardine, Matheson
& Ca., Ld.
Managing Director, Hong Kong Hotel
Shar:Broker, Tester & Abraham
Dodwell & Co,, Ld.
A. S. Watson & Co., Ld..
Compradore, H.K. & K. W. & G. Co., Ld
Department Manager, Sun Life Insur.
ance Co., Ld.
Leigh & Orange
Ho Kom-tong,
Ho Leung
Johnson, Marcus Theodore .
Joseph, Joseph Edgar -
Kan Tong-po
Lammert, Herbert Alexander.
Lauder, Paul
Lay Kam-fat
Lewis, Brian Lander
Little, Alexander Coulbourne.
Little. John Hargraves
:Mackie, Charles Gordon
Stewart
McHutchon, James Maitland
Miskin, Geoffrey
Murphy, Lewis Newton
Oliver, Roland Edward Henry:
Parker, Philo Woodworth
Pearce, Thomas Ernest
Plummer, John Archibald,
Pui Tso-yi (T. Y. Pei).
Rocha, Joao Maria da
Rodgers, Robert
Roza, Carlos Augusto an
Russell, Donald Oscar
Sandes, Charles Lancelot
Compton
Seth, John Hennessey
Shaw, Thomas Henry Robert
Sheppard, John Oram
Shields, Andrew Lusk
Silva-Netto, Antonio
Ferreira 13atalha
Start, Herbert Rothsay.
Sum Pak-ming,
Siltherland, Robert
Taggart, James Harper
Tester, Percy
Warren, John Percival
Wong, James Mow Lain
Wong Kam-Ink
Wong, 'KWong-tin
Wong-Tape, Benjamin
Wool, Gerald George
7 Caine Road.
On premises.
On premises.
Hong Kong Hotel,.
On premises.
170 The Peak.
On premises.
16 Mosque Street.
11 Peak Mansions.
5 Aighburth Hall, May Road.
188 The Peak.
On premises.
On premises.
104 The Peak.
On premises.
On premises.
AltcAena, The Peak.
299 The Peak.
515 The Peak.
9 Village Road.
3 Robinson Road.
137 The Peak.
3 May Road.
On premises.
Galesend, 302 The Peak.
Deepdene, Deep Water Bay.
On premises.
1.■Hattori Road, Hong Kong.
16 Peak Road.
32 Granville Road.
512 The Peak-.
On premises.
368 The Peak.
On premises.
9 Stewart Terrace, The Peak.
On premises.
On premises.
11 Arbuthnot Road.
Aimai Villas, Kowloon.
Kia Ora, Kowloon. City.
On premises.
If you'd like to try
If you'd like to try transcribing some years' lists with OCR, I suggest you work on some of the missing lists from 1893 and earlier (you can see the list of which years we've got at https://gwulo.com/node/6706).
I'd originally planned to go back to these once we've finished the 1930s, but it'd be great to get a head start on them with your help. It'll also let you see how long it takes on average to produce an accurate page, and we can compare that with the current method.
1881 Jurors list
I tried this list with OCR. Since the original file is in pretty good shape, I did not bother with any enhancement: I loaded the PDF into Nuance PDF Converter Professional and had it converted into a readable and editable PDF file. Then I C&P'ed the names column block by block into a text editor, from a few lines to a page at a time depending on how the text was highlighted by the software, to prevent the names from getting mixed up, and did the same with the occupations column. Then I did the proofreading and corrections, and imported it into Excel. The .............. after the names were not removed, as that requires line-by-line editing. If I had to enhance the original file in Photoshop, I would lasso them and remove them all in one shot.
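If the names are already in a text editor, the dot leaders after them could also be stripped in bulk rather than line by line. The following is a minimal sketch, not part of the workflow described above: the function name is hypothetical, the second and third sample names are made up, and it assumes the leaders are runs of at least three unspaced dots at the end of a line, separated from the name by a space.

```python
import re

# A run of three or more dots at the end of a line, plus surrounding spaces.
# The threshold of three is an assumption, chosen so that the full stop
# after an initial ("A.") is left alone.
LEADER_RE = re.compile(r"\s*\.{3,}\s*$")

def strip_dot_leaders(line: str) -> str:
    """Remove a trailing dot leader from one line of OCR'd text."""
    return LEADER_RE.sub("", line)

sample = [
    "Tan King Sing ..............",
    "Smith, John A. ........",   # made-up name with a trailing initial
    "Jones, William",            # made-up name with no leader
]
cleaned = [strip_dot_leaders(line) for line in sample]
print(cleaned)
```

If a leader runs straight on from an initial with no space ("A........"), the initial's full stop is removed too, and leaders that OCR has broken up with spaces are not caught, so the result still needs a quick proofread.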
Total time taken: C&P 18 minutes + proofread 64 minutes + import to Excel 3 minutes = 85 minutes for 8 pages.
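As a quick check of the figures above, and to express them per page, here is a small sketch using only the numbers reported in this comment (the variable names are just for illustration):

```python
# Timings reported above for the 8-page 1881 list, in minutes.
copy_paste = 18
proofread = 64
excel_import = 3
pages = 8

total = copy_paste + proofread + excel_import
per_page = total / pages

print(f"Total: {total} minutes, about {per_page:.1f} minutes per page")
```

The per-page figure is only an average for this one list; other years' scans may be cleaner or worse.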
Tan King Sing was listed as the manager of an opium farm on Bonham Strand.
1881 Juror List
Windows 10 One Note OCR
I next tried using Win10's built-in OneNote to do the OCR. I first split the original PDF into individual scanned pages and cropped out each column for separate conversion, to minimize confusion for the OCR software. No C&P of the OCR output was needed. Then I enhanced the same PDF column and repeated the conversion to check for any difference. It seems that the better the original JPEG file, the more accurate the conversion. More time is needed to prepare the files before OCR, but less time is needed to proofread the more accurate output.
Corrections can be made right in OneNote with the PDF and the OCR output side by side, then imported into EXCEL. Editing this enhanced list took me 5:55 minutes.
The XLS you made with Nuance
The XLS you made with Nuance looks better. A quick glance at the OneNote shows mistakes in lines 3 & 4 of the first column of OCR'd text, and even more errors in those lines in the column of OCR text from the enhanced PDF. So Nuance looks the way to go.
Next steps to put the Nuance file online will be:
Let me know if you need any help with any of these steps.
OCR output
Hi David,
The Nuance file was edited and the OneNote ones were pre-edit. The OCR output depends very much on the quality of the original PDF scan. Quite a lot of the letters were broken and not continuous, resulting in d being read as cl, H being read as I-I, etc. I think it may be faster to use OCR, separate each PDF into 3 long columns, and put each column side by side with the original scan for proofreading and corrections, as it took me only 85 minutes to do this two-column list of 8 pages.
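Known confusions like these can be used to point the proofreader at suspicious lines. The sketch below is only an illustration: the two patterns come from the comment above, while the function name and the sample lines are made up. It flags lines for a human to check rather than correcting them automatically, because "cl" also occurs in correctly read words.

```python
# Confusions mentioned above: "d" read as "cl", "H" read as "I-I".
# Keys are what the OCR produced, values are what was probably printed.
SUSPECT_PATTERNS = {
    "cl": "d",
    "I-I": "H",
}

def flag_suspects(lines):
    """Return (line_number, line, hints) for each line with a suspect pattern."""
    flagged = []
    for number, line in enumerate(lines, start=1):
        hints = [
            f"'{pattern}' may be '{letter}'"
            for pattern, letter in SUSPECT_PATTERNS.items()
            if pattern in line
        ]
        if hints:
            flagged.append((number, line, hints))
    return flagged

# Made-up sample lines, not taken from any Jurors list.
sample = ["I-Iolmes, Richarcl", "Jones, William", "Tan King Sing"]
flagged = flag_suspects(sample)
for number, line, hints in flagged:
    print(number, line, hints)
```

More pairs can be added to the dictionary as they turn up during proofreading.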
Understood, thanks. The 1881
Understood, thanks. The 1881 list looks good, thank you for posting that. I've added the standard headers, and slotted it in to the list of Jurors lists at https://gwulo.com/node/6706
If you can work backwards or forwards from there to add in any more years, they will be gratefully received!
Typhoons
While I was living in Hong Kong, 1959-63, there were some pretty heavy typhoons. We had to put shutters up at the doors and windows to the balcony and sit it out until somebody came round with an “all clear” message.
I went back 2 years ago and visited a museum which shows the devastation caused by the strongest typhoon while we lived there. All we had was many sea creatures, sand and wood on the balcony, but the damage elsewhere was immense.
Hi David,
I tried another way to do the lists. This one is a combination of your present method, OCR and EXCEL's database format. This may be faster as one does not have to compare a long row of data in a spreadsheet with another long row in the PDF.
1) Open the PDF and open the previous year's EXCEL file, then click Data > Form to open the database form. Place the DB form just below the PDF for easier comparison. Delete old entries that no longer apply, correct typos, and highlight the new jurors in the PDF so they can all be entered into the DB in one go later. This avoids having to scroll back and forth through hundreds of entries.
2) After finishing with all the pages that one wants to do, enter the new jurors by opening an OCR PDF below the original OCR. Either C&P from the OCR or type the info into the DB forms.
3) After it's all done, sort the DB.
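For anyone who prefers a script to the Excel form, the same three steps (drop entries that no longer apply, add the new jurors, sort) can be sketched as below. This is an illustration under assumptions, not the method used for the lists: the `Juror` record, the `update_list` function and every sample entry are hypothetical, and the typo-correction step is left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Juror:
    name: str
    occupation: str
    address: str

def update_list(previous, removed_names, new_jurors):
    """Build this year's list from last year's.

    previous      -- last year's Juror records
    removed_names -- names that no longer appear in this year's PDF
    new_jurors    -- Juror records highlighted as new in this year's PDF
    """
    kept = [j for j in previous if j.name not in removed_names]
    # Sort by name, ignoring case, like sorting the DB at the end.
    return sorted(kept + list(new_jurors), key=lambda j: j.name.lower())

# Made-up entries for illustration only.
previous = [
    Juror("Brown, Arthur", "Clerk", "On premises."),
    Juror("Adams, Basil", "Merchant", "On premises."),
]
removed_names = {"Brown, Arthur"}
new_jurors = [Juror("Clark, Cecil", "Banker", "On premises.")]

current = update_list(previous, removed_names, new_jurors)
print([j.name for j in current])
```

Matching on the full name is a simplification; two jurors with identical names would need an extra field to tell them apart.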
I've done the 1931 Special Jurors list, will email to you.
Thanks for the extra
Thanks for the extra investigation, and the file by email. I guess most contributors will stick to the simpler approach, but you're clearly very comfortable with these tools so if they save you time then they're good news!
Regards, David
Hi David,
I searched for my late father, Charlie Leung Chung-Yee, and came across your website with the BAAG tag.
I have a lot of authenticated information about my father. I wonder what the best way is for me to contribute this additional information about him to the forum.
Regards
Alfred
Charlie Leung Chung-Yee
Hi Alfred, sorry for the slow reply. I see you've already started adding information to his page at https://gwulo.com/node/23631. Please go ahead and add any more information about his time with the BAAG to that page. It'll also be good if you could add the Chinese characters for his name.
I've merged the separate page you'd made for your father back into the main one.
Thanks & regards, David
Patrick Henry Murray and Lucretia Mary Reed
I had reached a dead end as I had no info on Patrick Henry Murray and limited info on Lucretia Mary Reed. I have now added much to my family tree in Ancestry.ca. Very much appreciated. ---Thomas Andrews, great-grandson. Canada
Home Addresses of the Jurors?
I had no idea that there was a jury system in the British Hong Kong Government. Are the home addresses of the jurors available somewhere, so I can be sure that the people named on the Jurors list were my ancestors?
re: Home Addresses of the Jurors?
The addresses given in the Jurors Lists are a mix of residential and office addresses, but it's often possible to guess which type they are by the address given. If you need any help with an address it's best to ask the question in a new forum post: https://gwulo.com/node/add/forum/2