Now, all Indian languages in cyberspace: The Hindu



Bangalore: The Centre for Development of Advanced Computing (CDAC) hit a significant milestone on Tuesday in its journey to bridge the digital divide in India.

In a medium where English is the presiding deity, the CDAC’s release of fonts and software tools in six languages — thus completing the releases in all 22 official Indian languages — is a significant step towards increasing Indian language content on the Internet.

Packages in Bangla, Konkani, Kashmiri, Sindhi, Manipuri and Santali (all except Bangla available in two scripts), along with 16 others, are now available free for download or can be ordered from the Indian Language Data Centre website (www.ildc.in). A quick peek at the Hindi Firefox browser reveals a completely customised browser with only the logo and the URL in English. Besides TrueType fonts and Unicode-compliant OpenType fonts that can be installed, the CD offers localised avatars of software tools such as the full office suite OpenOffice (3.0), the Firefox browser, Pidgin (a multiprotocol messenger), the Thunderbird email client and a content management system.

So, seven lakh CDs (delivered on demand) and 39 lakh downloads later, the GIST (Graphics and Intelligence-based Scripting Technologies) team at the CDAC is now looking ahead. Mahesh D. Kulkarni, programme coordinator, GIST, says: “People think language tools are expensive and is not growing. So we offered this basic information processing tool — with fonts and functions like word processing and email — free of cost to the common man.” This will increase local content on the web and drive futuristic localised technologies such as Optical Character Recognition, grammar check and machine-assisted translation systems.

Given that one in every six villages is slated to get a Common Service Centre (CSC) providing free e-services to the masses (as envisioned by the National e-Governance Plan, NeGP), local language development is critical. However, despite the efforts of the government and independent developers, local language computing has yet to make significant inroads into mainstream digital society. The multiplicity of fonts and scripts and their non-linear nature resulted in slow development until recently. The CDAC, too, did not get ahead with its first package till 2005, rather late by Internet standards.

Experts point out that though significant funds have been devoted to font development, the non-free terms of licensing and distribution and the lack of quality control dampen these initiatives. Hariprasad Nadig, part of the Kannada localisation team of Sampasa, says releases so far have been riddled with bugs.

For instance, some packages (when tested randomly) did not install on Debian (a GNU/Linux operating system). “Even the translations are so Sanskritised and unprofessionally done that it may put off the lay user. There aren’t any quality checks and a random assortment of packages seems to be thrown in,” Mr. Nadig points out. Another key issue is that many of these packages, such as the messenger, will need updates, which may not be possible.

Such issues can be solved by making the code publicly available instead of using restrictive licences, Mr. Nadig says. Using the GNU General Public Licence (a free software licence) will help to utilise local developers’ talent without returning to (and paying) vendors each time the software needs to be modified.