Domain-specific languages (DSLs) are fast becoming a popular way to describe a problem or a solution in a specific domain. Code written in a DSL is far more readable than regular "technical" code (in Java or C#, for example). Since there is ample information about DSLs online, I am not going to spend time explaining what a DSL is.
In many of the previous posts, I used pseudo-code to demonstrate parallels between programming and Panini's techniques. Time to call the bluff now. Presented below is seriously tested code: a DSL that closely models some basic techniques of the ashtaadhyaayI, specifically the maheshvara-sutra-s and those darned "it" rules. I'm using Groovy for the implementation, as I feel that its syntax is more natural to read than that of Scala or Ruby.
Let's define some classes.
[Listing 1: SivaSutra.groovy]
package ch8

import java.util.List

/**
 * Implementation of Maheshvara Sutra using SimpleScript transliteration scheme
 * The table itself can be moved to a groovy configuration file to allow a different scheme like HK, ITRANS or AST
 *
 * @author vsrinivasan
 */
@Singleton
class SivaSutra implements Iterable {   //Iterable is needed for the @Override on iterator() to compile
    //siva-sutraani
    List table = [
        ['a', 'e', 'u', 'N'],
        ['r.', 'l.', 'k'],
        ['E.', 'o', 'n'],
        ['i', 'O.', 'c'],
        ['h', 'y', 'v', 'r', 't'],
        ['l', 'N'],
        ['n.', 'm', 'n', 'N', 'N.', 'm'],
        ['J', 'B', 'n.'],
        ['G', 'D', 'D.', 's.'],
        ['j', 'b', 'g', 'd', 'd.', 's'],
        ['K', 'P', 'C', 'T', 'T.', 'c', 't', 't.', 'v'],
        ['k', 'p', 'y'],
        ['s', 's.', 'S', 'r'],
        ['h', 'l']
    ]

    List list = table.flatten()

    int indexOf(String varna) { list.indexOf(varna) }

    @Override
    Iterator iterator() { list.iterator() }

    //eShaam antyaaH it
    List itMarkers = table.collect { it.last() }

    /**
     * is this an iT-marker?
     * this finds only the 'pratyahara iT'; for other it-s see ItRules.groovy
     *
     * @see ItRules
     */
    boolean isIt(f) { itMarkers.contains(f) }

    /**
     * expands a given pratyahara, including all the iT-s
     * not for practical purposes, but good for testing
     *
     * @param pratyahara
     * @return
     */
    List expand(String pratyahara) {
        def (begin, end) = pratyahara.varnas()
        //slice by position; indexOf picks the first occurrence of each varna
        list[indexOf(begin)..indexOf(end)]
    }

    /**
     * returns the real pratyahara varna-s, excluding the intermediate it-markers
     * very procedural implementation, need to make it groovy-like
     *
     * @param pratyahara
     * @return
     */
    List collect(String pratyahara) {
        def (begin, end) = pratyahara.varnas()
        boolean start = false
        def result = []
        table.each { line ->
            line.each { item ->
                if (item == begin || start) {
                    //collect everything except the it-marker that closes each sutra
                    if (item != line.last()) {
                        result << item
                        start = true
                    }
                    //stop once the closing it-marker of the pratyahara is reached
                    if (item == end && item == line.last()) {
                        start = false
                    }
                }
            }
        }
        return result
    }
}
[Listing 2: ItRules.groovy]
package ch8

@Singleton
class ItRules {
    //#(1.3.2) upadeshe ajanunaasika iT, anunAsika-s are denoted by a "-" at the end,
    // may be M would be a better option?
    def ajanunasika = 'aAeEuUr.R.l.E.IOO.'.varnas().collect { it + "-" }

    //cutU
    def cu = 'cCjJn.'.varnas()
    def tu = 'tTdDN'.varnas()

    //s.asca, denoting as "sha" for convenience
    def sha = ['s.']

    //lasaku (ataddhite)
    def ku = 'kKgGn'.varnas()
    def lasaku = 'ls'.varnas() + ku

    //some more to be defined

    //#(1.3.3) halantyam - check if the last char is hal
    SivaSutra sivaSutra = SivaSutra.instance

    boolean hasHalantyam(String pratyaya) { pratyaya.varnas().last() in sivaSutra.hl }

    //allItMarkers except hal, which is applicable only to last letter
    //(the field is named lasaku above; the original "lashaku" was a typo)
    def allItMarkers = ajanunasika + cu + tu + lasaku

    boolean isAnunasika(String varna) { varna.endsWith('-') }

    boolean isItMarker(String varna) { varna in allItMarkers }

    String tasyaLopah(String pratyaya) {
        (pratyaya.halantyam().varnas() - allItMarkers).join()
    }
}
[Listing 3: Main.groovy]
package ch8.tests

import java.util.List
import ch8.ItRules
import ch8.SivaSutra
import ch8.schemes.SimpleScriptScheme
import ch8.Samjna

/*
 * DSL: varnas() closure - tokenize the script into individual varnas (list)
 */
String.metaClass.varnas = { new SimpleScriptScheme().tokenize(delegate) }

/*
 * DSL: halantyam() closure - remove the last hal iT and return the modified String
 */
String.metaClass.halantyam = {
    ItRules itRules = ItRules.instance
    def varnas = delegate.varnas() as List
    if (itRules.hasHalantyam(delegate)) {
        varnas.remove(varnas.size()-1)
    }
    varnas.join()
}

/*
 * DSL: tasyaLopah() closure - remove all the it-markers from a pratyaya
 */
String.metaClass.tasyaLopah = {
    ItRules itRules = ItRules.instance
    itRules.tasyaLopah(delegate)
}

/*
 * DSL: Direct exposition of a pratyaya or a pratyahara!
 */
SivaSutra sivaSutra = SivaSutra.instance
sivaSutra.metaClass.getProperty = { String pratyahara ->
    def metaProperty = SivaSutra.metaClass.getMetaProperty(pratyahara)
    def result
    if (metaProperty) {
        //if there is an existing property invoke that
        result = metaProperty.getProperty(delegate)
    } else {
        //inspect the property and convert it to varnas
        //taparastatkaalasya rule; need to formulate in a better way
        if (pratyahara.endsWith('t.')) {
            result = (pratyahara - 't.').varnas()
        } else {
            result = sivaSutra.collect(pratyahara)
        }
    }
    result
}

void testSivaSutra() {
    SivaSutra sivaSutra = SivaSutra.instance
    sivaSutra.table.each { println it }   //print the maheshvara sutrani
    println sivaSutra.list                //print a flattened version of the maheshvara sutrani
    sivaSutra.each { println it }         //another way to print flattened maheshvara sutrani
    println sivaSutra.itMarkers           //print only the it markers
    assert sivaSutra.isIt('n.')           //check if n. is an it marker
    assert sivaSutra.expand('ak') == ['a','e','u','N','r.','l.','k'] //expand pratyahara including the it
    //'ak' runs up to the k marker of the second sutra, so r. and l. are included
    assert ['a', 'e', 'u', 'r.', 'l.'] == sivaSutra.collect('ak') //pratyahara excluding iT
    assert ['a', 'e', 'u', 'r.', 'l.'] == sivaSutra.ak //another way of getting the pratyahara! Meta-programming in play!
}

void testItRules() {
    ItRules itRules = ItRules.instance
    println itRules.ajanunasika           //prints all the ac anunasikas
    assert "lyut".varnas() == ['l', 'y', 'u', 't']
}

void testHalantyamRule() {
    //print the pratyaya-s after the halantyam rule is applied
    ["kt.va", "Gan.", "kt.vat.", "sap", "lyu-t", "saN", "sat.r."].each {
        println it + " = " + it.halantyam()
    }
    assert 'kt.va' == 'kt.va'.halantyam()
    assert 'kt.va' == 'kt.vat.'.halantyam()
    assert 'Ga' == 'Gan.'.halantyam()
}

void testTasyaLopahRule() {
    ["Gan.", "kt.vat.", "sap", "lyu-t", "saN", "satr."].each {
        println it + " = " + it.tasyaLopah()
    }
    assert 'a' == 'Gan.'.tasyaLopah()
    assert 't.va' == 'kt.vat.'.tasyaLopah()
}

void testSamjnaSutras() {
    SivaSutra sivaSutra = SivaSutra.instance
    def vruddhi = sivaSutra.'At.' + sivaSutra.ic
    def guna = sivaSutra.'at.' + sivaSutra.'E.n'
    assert ['A', 'i', 'O.'] == vruddhi
    assert ['a', 'E.', 'o'] == guna
}

testSivaSutra()
testItRules()
testHalantyamRule()
testTasyaLopahRule()
testSamjnaSutras()
[Listing 4: SimpleScriptScheme.groovy]
package ch8.schemes

/**
 * A simple script tokenizer
 *
 * @author vsrinivasan
 */
class SimpleScriptScheme implements ScriptScheme {
    // hyphen denotes anunasika
    static List NotationMarkers = ['.', ':', '-']

    /**
     * split/tokenize a given word into a list of varnas
     * the word could be a pada, shabda, pratyaya or pratyahara
     * needs to handle anunasika properly
     *
     * @calledby String.metaClass.varnas()
     * @param word
     * @return list of varnas
     */
    @Override
    public List tokenize(String word) {
        def varnas = []
        word.eachWithIndex { c, i ->
            c = ((i < word.length()-1) ? ((word[i+1] in NotationMarkers) ? (c + word[i+1]) : c) : c)
            if (!(c in NotationMarkers)) varnas << c
        }
        varnas
    }
}
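To try the tokenizing idea without the rest of the project, here is a minimal standalone sketch. It is not the listing above: it swaps the index-based loop for a regular expression (one base character optionally followed by a single notation marker), which I believe gives the same result for the sample inputs shown but has not been checked against every input. The regex and the inline closure are my own assumptions for illustration.

```groovy
// Hypothetical regex-based stand-in for SimpleScriptScheme.tokenize():
// one non-marker character, optionally followed by one marker ('.', ':' or '-').
String.metaClass.varnas = { -> delegate.findAll(/[^.:\-][.:\-]?/) }

// The DSL method is now available directly on any String
assert 'kt.va'.varnas() == ['k', 't.', 'v', 'a']
assert 'lyu-t'.varnas() == ['l', 'y', 'u-', 't']
assert 'ka:'.varnas()   == ['k', 'a:']
```

The point is only that a scheme with a fixed set of trailing markers can be tokenized in a line or two; a scheme with multi-letter varnas (such as HK or ITRANS) would need a different pattern.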
Now some observations and analysis:
- Doing this in regular Java or C# would require several objects, wrapper classes and utility methods. Using meta-programming techniques and defining a clean DSL makes this a very interesting implementation.
- The ability to work directly on strings, lists and maps makes a huge difference, as opposed to wrapping strings and creating objects like pratyahara, it, pratyaya etc.
- Main.groovy is self-explanatory in what's given and what's expected. This is not pseudo-code anymore! Note the direct method invocations like varnas(), halantyam() and tasyaLopah() on Strings. Also observe the direct reference to a pratyahara (sivaSutra.ac will expand to a list of vowels). Metaprogramming, awesome or what?
- Also observe the testSamjnaSutras() definitions. The only reason I have to quote the properties is the use of the dot in the scheme. A symbol-less scheme like AST would make for very readable code.
- The code uses SimpleScript for devanagari transliteration. As I had mentioned in a previous post, parsing the script is trivial, because of a strict 1:1 mapping between English and Sanskritam letters. It took less than 5 minutes to write.
- However, the code allows any transliteration scheme to be used, if one can come up with it, by implementing the ScriptScheme interface. Harvard-Kyoto, ITRANS or AST or even Unicode - as long as the individual varna-s are correctly tokenized, the program will work fine.
- Any script scheme can be supplied via a groovy configuration and read by ConfigSlurper!
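As a rough illustration of the ConfigSlurper idea, the sketch below parses a small configuration and derives the it-markers from it. The key names (scheme, name, notationMarkers, table) and the shortened two-row table are assumptions made up for this example; the real project layout may differ, and the configuration would normally live in its own file rather than in a string.

```groovy
// Hypothetical configuration text; the keys are assumptions, not the project's.
def configText = '''
scheme {
    name = 'SimpleScript'
    notationMarkers = ['.', ':', '-']
    // only the first two sutras, to keep the sketch short
    table = [
        ['a', 'e', 'u', 'N'],
        ['r.', 'l.', 'k']
    ]
}
'''

// ConfigSlurper evaluates the text as a Groovy script and returns a ConfigObject
def config = new ConfigSlurper().parse(configText)

assert config.scheme.name == 'SimpleScript'
assert config.scheme.notationMarkers == ['.', ':', '-']

// The same "last item of each sutra" rule as in SivaSutra, now driven by config
List itMarkers = config.scheme.table.collect { it.last() }
assert itMarkers == ['N', 'k']
```

Swapping schemes would then mean swapping the configuration, provided the matching tokenizer is supplied as well.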
Imagine being able to work out sandhis just by using the plus sign! For example:
assert "bhavati" == bhU + sap + tin //1st gana
assert "kasca" == 'ka:' + sca //scutva sandhi
Wouldn't that be really, really cool? And it's not impossible. It will only take a little more effort to expand the DSL to include anga, guna, operator overloading for sandhi rules, etc.!
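To give a feel for the plus-sign idea, here is a toy sketch of operator overloading in Groovy. The Pada class and its single rule are hypothetical and not part of the listings above: it only turns a final visarga (':') into 's' before 'c' or 'C', assuming, as in the SimpleScript table, that 's' stands for the palatal sibilant. Real sandhi would need the full rule set.

```groovy
// Hypothetical wrapper class; Groovy maps the + operator to plus()
class Pada {
    final String text

    Pada(String text) { this.text = text }

    Pada plus(Pada other) {
        // Toy rule (an assumption for illustration): visarga before c/C becomes s
        boolean visargaBeforeC = text.endsWith(':') && other.text && other.text[0] in ['c', 'C']
        if (visargaBeforeC) {
            return new Pada(text.substring(0, text.length() - 1) + 's' + other.text)
        }
        // No rule applies: plain concatenation
        new Pada(text + other.text)
    }

    String toString() { text }
}

def ka = new Pada('ka:')
def ca = new Pada('ca')
assert (ka + ca).toString() == 'kasca'

// Words that trigger no rule are simply joined
assert (new Pada('bhava') + new Pada('ti')).toString() == 'bhavati'
```

A fuller version would probably dispatch to a list of rule objects instead of hard-coding one condition inside plus().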
Imagine similar DSL-s being implemented for parsing shlokas to determine chandas! The potential for a Samskritam DSL is huge.
4 comments:
['a', 'e', 'u', 'N'] -> ['a', 'i', 'u', 'N'],
bhavitavyaM khalu? (It ought to be this, surely?)
Oh avagataM - anyA lipiH upayujyate. (Oh, understood: a different script is being used.)
@vishvas, aam anyaa lipiH asti (yes, it is a different script). As you can see in the SimpleScriptScheme, it was just a one-liner to tokenize. I didn't want to spend time on tokenizing other schemes, which someone has already done. Having said that, it's not very difficult to substitute this with a standard scheme like HK, ITRANS or IAST.
nUtanalipi - http://xyzcompany.in/2011/12/15/nutanalipi-new-script/
----------
OM
a A i I u U R RR L LL E Y O W M H
k k' g g' q
c c' j j' Q
T T' D D' N
t t' d d' n
p p' b b' m
y r l v
S S' s h
l; x
ka kA ki kI ku kU kR kRR kL kLL kE kY kO kW k[]M k[]H
akArO muk'aH sarvad'armANAm AdyanutpannatvAt ||
d'armO raxati raxitaH |
agnimILE purOhitaM yajQasya dEvam Rtvijam
hOtAraM ratnad'Atamam
sahasraSIrS'A puruS'aH sahasrAxaH sahasrapAt
sab'UmiM viSvatO vRtvAtyatiS'T'ad daSAqgulam
puruS'a EvEdaM sarvaM yad b'UtaM yacca b'avyam
at'AtO d'Atust'adOS'agatyavikArahEtub'UtArt'avArd'akadravyANyadyAt