Domain-specific languages (DSLs) are fast becoming a popular way to describe a problem or a solution for a specific domain. Code written in a DSL can be far more readable than regular "technical" code (in Java or C#, for example). Since information about DSLs is easy to find online, I am not going to spend time on what a DSL is.
In many of the previous posts, I used pseudo-code to demonstrate parallels between programming and Panini's techniques. Time to call the bluff now. Presented below is real, tested code: a DSL that closely models some basic techniques of the ashtaadhyaayI, specifically the maheshvara-sutra-s and those darned "it" rules. I'm using Groovy for the implementation, as I feel that its syntax is more natural to read than that of Scala or Ruby.
Let's define some classes.
[Listing 1: SivaSutra.groovy]
package ch8
import java.util.List
/**
* Implementation of Maheshvara Sutra using SimpleScript transliteration scheme
* The table itself can be moved to a groovy configuration file to allow a different scheme like HK, ITRANS or IAST
*
* @author vsrinivasan
*/
@Singleton
class SivaSutra implements Iterable {
//siva-sutraani
List table =
[
['a', 'e', 'u', 'N'],
['r.', 'l.', 'k'],
['E.', 'o', 'n'],
['i', 'O.', 'c'],
['h', 'y', 'v', 'r', 't'],
['l', 'N'],
['n.', 'm', 'n', 'N', 'N.', 'm'],
['J', 'B', 'n.'],
['G', 'D', 'D.', 's.'],
['j', 'b', 'g', 'd', 'd.', 's'],
['K', 'P', 'C', 'T', 'T.', 'c', 't', 't.', 'v'],
['k', 'p', 'y'],
['s', 's.', 'S', 'r'],
['h', 'l']
]
List list = table.flatten()
int indexOf(String varna) { list.indexOf(varna) }
@Override
Iterator iterator() { list.iterator() }
//eShaam antyaaH it
List itMarkers = table.collect { it.last() }
/**
* is this an iT-marker?
* this checks only the pratyahara iT-markers defined in the table; for other it-s see ItRules.groovy
*
* @see ItRules
*/
boolean isIt(f) { itMarkers.contains(f) }
/**
* expands a given pratyahara, including all the iT-s
* not for practical purposes, but good for testing
*
* @param pratyahara
* @return
*/
List expand(String pratyahara) {
def (begin, end) = pratyahara.varnas()
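//begin and end are varna strings, so look up their positions (first occurrence) before slicing
list[indexOf(begin)..indexOf(end)]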
}
/**
* returns the real pratyahara varna-s, excluding the intermediate it-markers
* very procedural implementation, need to make it groovy-like
*
* @param pratyahara
* @return
*/
List collect(String pratyahara) {
def (begin, end) = pratyahara.varnas()
boolean start = false
def result = []
table.each { line ->
line.each { item ->
if (item == begin || start) {
if (item != line.last()) {
result << item
start = true
}
if (item == end && item == line.last()) {
start = false
}
}
}
}
return result
}
}
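The comment on collect() asks for a more Groovy-like version. One possible sketch, not the original implementation, is below. It assumes a hypothetical standalone closure named collectVarnas that takes the starting varna and the closing it-marker directly (so it does not need the varnas() tokenizer), and it uses only the first four rows of the table to stay short.

```groovy
// First four rows of the table from Listing 1 (SimpleScript scheme), shortened for this sketch
List<List<String>> table = [
    ['a', 'e', 'u', 'N'],
    ['r.', 'l.', 'k'],
    ['E.', 'o', 'n'],
    ['i', 'O.', 'c']
]

// Hypothetical helper: the varna-s of a pratyahara, without any it-markers
def collectVarnas = { List<List<String>> rows, String begin, String end ->
    // every row minus its final it-marker
    def bare = rows.collect { it.take(it.size() - 1) }
    // first row that holds the starting varna
    int first = bare.findIndexOf { begin in it }
    if (first < 0) return []
    // first row, at or after that one, that is closed by the requested it-marker
    int last = rows.findIndexOf(first) { it.last() == end }
    if (last < 0) return []
    // join the marker-free rows and drop anything before the starting varna
    def varnas = bare[first..last].flatten()
    varnas.drop(varnas.indexOf(begin))
}

assert collectVarnas(table, 'a', 'k') == ['a', 'e', 'u', 'r.', 'l.']
assert collectVarnas(table, 'i', 'c') == ['i', 'O.']
```

Searching for the closing marker only from the starting row onwards is a design choice of this sketch: it keeps a repeated marker such as 'N' from matching a row that comes before the starting varna.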
[Listing 2: ItRules.groovy]
package ch8
@Singleton
class ItRules {
//#(1.3.2) upadeshe ajanunaasika iT, anunAsika-s are denoted by a "-" at the end,
// may be M would be a better option?
def ajanunasika = 'aAeEuUr.R.l.E.IOO.'.varnas().collect { it + "-" }
//cutU
def cu = 'cCjJn.'.varnas()
def tu = 'tTdDN'.varnas()
//s.asca, denoting as "sha" for convenience
def sha = ['s.']
//lasaku (ataddhite)
def ku = 'kKgGn'.varnas()
def lasaku = 'ls'.varnas() + ku
//some more to be defined
//#(1.3.3) halantyam - check if the last char is hal
SivaSutra sivaSutra = SivaSutra.instance
boolean hasHalantyam(String pratyaya) { pratyaya.varnas().last() in sivaSutra.hl }
//allItMarkers except hal, which is applicable only to last letter
def allItMarkers = ajanunasika + cu + tu + lasaku
boolean isAnunasika(String varna) { varna.endsWith('-') }
boolean isItMarker(String varna) { varna in allItMarkers }
String tasyaLopah(String pratyaya) { (pratyaya.halantyam().varnas() - allItMarkers).join() }
}
[Listing 3: Main.groovy]
package ch8.tests
import java.util.List
import ch8.ItRules
import ch8.SivaSutra
import ch8.schemes.SimpleScriptScheme
import ch8.Samjna
/*
* DSL: varnas() closure - tokenize the script into individual varnas (list)
*/
String.metaClass.varnas = {
new SimpleScriptScheme().tokenize(delegate)
}
/*
* DSL: halantyam() closure - remove the last hal iT and return the modified String
*/
String.metaClass.halantyam = {
ItRules itRules = ItRules.instance
def varnas = delegate.varnas() as List
if (itRules.hasHalantyam(delegate)) {
varnas.remove(varnas.size()-1)
}
varnas.join()
}
/*
* DSL: tasyaLopah() closure - remove all the it-markers from a pratyaya
*/
String.metaClass.tasyaLopah = {
ItRules itRules = ItRules.instance
itRules.tasyaLopah(delegate)
}
/*
* DSL: Direct exposition of a pratyaya or a pratyahara!
*/
SivaSutra sivaSutra = SivaSutra.instance
sivaSutra.metaClass.getProperty = { String pratyahara ->
def metaProperty = SivaSutra.metaClass.getMetaProperty(pratyahara)
def result
if(metaProperty) {
//if there is an existing property invoke that
result = metaProperty.getProperty(delegate)
} else {
//inspect the property and convert it to varnas
//taparastatkaalasya rule; need to formulate in a better way
if (pratyahara.endsWith('t.')) {
result = (pratyahara - 't.').varnas()
} else {
result = sivaSutra.collect(pratyahara)
}
}
result
}
void testSivaSutra() {
SivaSutra sivaSutra = SivaSutra.instance
sivaSutra.table.each { println it } //print the maheshvara sutrani
println sivaSutra.list //print a flattened version of the maheshvara sutrani
sivaSutra.each { println it } //another way to print flattened maheshvara sutrani
println sivaSutra.itMarkers //print only the it markers
assert sivaSutra.isIt('n.') //check if n. is an it marker
assert sivaSutra.expand('ak') == ['a','e','u','N','r.','l.','k'] //expand pratyahara including the it
assert ['a', 'e', 'u', 'r.', 'l.'] == sivaSutra.collect('ak') //pratyahara excluding iT: 'ak' runs up to the marker k, so it spans the first two sutras
assert ['a', 'e', 'u', 'r.', 'l.'] == sivaSutra.ak //another way of getting the pratyahara! Meta-programming in play!
}
void testItRules() {
ItRules itRules = ItRules.instance
println itRules.ajanunasika //prints all the ac anunasikas
assert "lyut".varnas() == ['l', 'y', 'u', 't']
}
void testHalantyamRule() {
//print the pratyahara-s after the halantyam rule applied
["kt.va", "Gan.", "kt.vat.", "sap", "lyu-t", "saN", "sat.r."].each { println it + " = " + it.halantyam() }
assert 'kt.va' == 'kt.va'.halantyam()
assert 'kt.va' == 'kt.vat.'.halantyam()
assert 'Ga' == 'Gan.'.halantyam()
}
void testTasyaLopahRule() {
["Gan.", "kt.vat.", "sap", "lyu-t", "saN", "satr."].each { println it + " = " + it.tasyaLopah() }
assert 'a' == 'Gan.'.tasyaLopah()
assert 't.va' == 'kt.vat.'.tasyaLopah()
}
void testSamjnaSutras() {
SivaSutra sivaSutra = SivaSutra.instance
def vruddhi = sivaSutra.'At.' + sivaSutra.ic
def guna = sivaSutra.'at.' + sivaSutra.'E.n'
assert ['A', 'i', 'O.'] == vruddhi
assert ['a', 'E.', 'o'] == guna
}
testSivaSutra()
testItRules()
testHalantyamRule()
testTasyaLopahRule()
testSamjnaSutras()
[Listing 4: SimpleScriptScheme.groovy]
package ch8.schemes
/**
* A simple script tokenizer
*
* @author vsrinivasan
*/
class SimpleScriptScheme implements ScriptScheme {
// hyphen denotes anunasika
static List NotationMarkers = ['.', ':', '-']
/**
* split/tokenize a given word into a list of varnas
* the word could be a pada, shabda, pratyaya or pratyahara
* needs to handle anunasika properly
*
* @calledby String.metaClass.varnas()
* @param word
* @return list of varnas
*/
@Override
public List tokenize(String word) {
def varnas = []
word.eachWithIndex { c, i ->
c = ((i < word.length()-1) ? ((word[i+1] in NotationMarkers) ? (c + word[i+1]) : c) : c)
if (!(c in NotationMarkers)) varnas << c
}
varnas
}
}
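The ScriptScheme interface itself is not shown in the listings. A minimal sketch of what it might look like is below, assuming it declares only the tokenize method that SimpleScriptScheme overrides. RegexScriptScheme is a hypothetical alternative tokenizer added purely for illustration: it treats a varna as one base character optionally followed by one notation marker.

```groovy
// Assumed shape of the interface; the original source is not part of the post
interface ScriptScheme {
    List tokenize(String word)
}

// Hypothetical alternative to SimpleScriptScheme, using a regular expression
class RegexScriptScheme implements ScriptScheme {
    @Override
    List tokenize(String word) {
        // one non-marker character, optionally followed by '.', ':' or '-'
        word.findAll(/[^.:-][.:-]?/)
    }
}

def scheme = new RegexScriptScheme()
assert scheme.tokenize('kt.vat.') == ['k', 't.', 'v', 'a', 't.']
assert scheme.tokenize('lyu-t') == ['l', 'y', 'u-', 't']
```

Any class with this shape could be swapped into the varnas() closure in Main.groovy in place of SimpleScriptScheme.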
Now some observations and analysis:
- Doing this in regular Java/C# would require several objects, wrapper classes and utility methods. Metaprogramming and a clean DSL make for a much more compact and interesting implementation.
- The ability to work directly on strings, lists and maps makes a huge difference, as opposed to writing wrappers around strings and creating objects like pratyahara, it, pratyaya etc.
- Main.groovy is self-explanatory in what's given and what's expected. This is not pseudo-code anymore! Note the direct method invocations like varnas(), halantyam() and tasyaLopah() on Strings. Also observe the direct reference to a pratyahara (sivaSutra.ac will expand to a list of vowels). Metaprogramming, awesome or what?
- Also observe the testSamjnaSutras() definitions. The only reason I have to quote the properties is the use of the dot in the scheme. A symbol-less scheme like IAST would make for very readable code.
- The code uses SimpleScript for Devanagari transliteration. As I mentioned in a previous post, parsing the script is trivial because of a strict 1:1 mapping between English and Sanskritam letters. It took less than 5 minutes to write.
- However, the code allows any transliteration scheme to be used by implementing the ScriptScheme interface. Harvard-Kyoto, ITRANS, IAST or even Unicode: as long as the individual varna-s are correctly tokenized, the program will work fine.
- Any script scheme can be supplied via a groovy configuration and read by ConfigSlurper!
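As a rough illustration of the ConfigSlurper idea, the sketch below reads the notation markers from a configuration script and feeds them to a small tokenizer. The key names (scheme.name, scheme.markers) and the tokenize closure are assumptions made up for this example, not part of the original code; the configuration is parsed from an inline string only to keep the example self-contained.

```groovy
// Hypothetical configuration; the key names are invented for this sketch
def configText = '''
scheme {
    name = 'SimpleScript'
    markers = ['.', ':', '-']
}
'''

def config = new ConfigSlurper().parse(configText)
List markers = config.scheme.markers

// Same tokenizing idea as Listing 4, but with the markers taken from the configuration
def tokenize = { String word ->
    def varnas = []
    word.each { String c ->
        if (c in markers) {
            // attach the marker to the preceding character, if there is one
            if (varnas) {
                int lastIndex = varnas.size() - 1
                varnas[lastIndex] = varnas[lastIndex] + c
            }
        } else {
            varnas << c
        }
    }
    varnas
}

assert tokenize('Gan.') == ['G', 'a', 'n.']
```

A different scheme would then only need a different configuration, provided its varnas can still be described as a base character plus trailing markers.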
Imagine being able to work out sandhis just by using the plus sign! For example:
assert "bhavati" == bhU + sap + tin //1st gana
assert "kasca" == 'ka:' + sca //scutva sandhi
Wouldn't that be really cool? And it is not impossible. It would only take a little more effort to expand the DSL to include anga, guna, operator overloading for sandhi rules, and so on.
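To give a feel for how the plus sign could carry a sandhi rule, here is a minimal sketch using Groovy's operator overloading (a plus method on a class). The Pada class and its single rule are hypothetical and heavily simplified: a final visarga (':') followed by an initial c or C is replaced by 's', in the spirit of the scutva example. A real implementation would need the full set of rules and a proper tokenizer.

```groovy
// Hypothetical wrapper class; not part of the original listings
class Pada {
    final String text

    Pada(String text) { this.text = text }

    // Groovy maps the + operator to plus(), so this is where a sandhi rule can live
    Pada plus(Pada other) {
        boolean visargaBeforeCu = text.endsWith(':') && other.text && other.text[0] in ['c', 'C']
        if (visargaBeforeCu) {
            // simplified rule: drop the visarga and insert 's' before c/C
            return new Pada(text.substring(0, text.length() - 1) + 's' + other.text)
        }
        // no rule applies: plain concatenation
        new Pada(text + other.text)
    }

    @Override
    String toString() { text }
}

def joined = new Pada('ka:') + new Pada('ca')
assert joined.toString() == 'kasca'
```

Wrapping the string in a class keeps the standard String + String behaviour untouched; whether to override plus on String itself through the metaclass instead is a separate design decision.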
Similar DSLs could be implemented for parsing shlokas to determine chandas. The potential for a Samskritam DSL is huge.
4 comments:
['a', 'e', 'u', 'N'] -> ['a', 'i', 'u', 'N'],
bhavitavyaM khalu? (Shouldn't it be so?)
Oh avagataM - anyA lipiH upayujyate. (Oh, understood: a different script is being used.)
@vishvas, aam anyaa lipiH asti (yes, it is a different script) | As you can see in the SimpleScriptScheme, it was just a one-liner to tokenize. I didn't want to spend time on tokenizing other schemes, which someone has already done. Having said that, it's not very difficult to substitute this with a standard scheme like HK, ITRANS or IAST.
nUtanalipi - http://xyzcompany.in/2011/12/15/nutanalipi-new-script/
----------
OM
a A i I u U R RR L LL E Y O W M H
k k' g g' q
c c' j j' Q
T T' D D' N
t t' d d' n
p p' b b' m
y r l v
S S' s h
l; x
ka kA ki kI ku kU kR kRR kL kLL kE kY kO kW k[]M k[]H
akArO muk'aH sarvad'armANAm AdyanutpannatvAt ||
d'armO raxati raxitaH |
agnimILE purOhitaM yajQasya dEvam Rtvijam
hOtAraM ratnad'Atamam
sahasraSIrS'A puruS'aH sahasrAxaH sahasrapAt
sab'UmiM viSvatO vRtvAtyatiS'T'ad daSAqgulam
puruS'a EvEdaM sarvaM yad b'UtaM yacca b'avyam
at'AtO d'Atust'adOS'agatyavikArahEtub'UtArt'avArd'akadravyANyadyAt