vAgartha: - वाच: अर्थ: |: programming


Tuesday, November 15, 2011

Read, Restore and so forth

My first sight of a computer was in 1983, in a remote town in India whose presiding deity is a representation of "Conscious-Ethereal Grand Cosmic Nothingness". Our science teacher somehow got hold of somebody who had a Commodore 64. About 40 students from our class (India was that much less populous 30 years ago) walked about 5 kilometers on a rainy day to that computer guy's house. We were allowed in batches of 10 into a dimly lit room and were seated on the floor. A girl, sitting on a chair, was holding a joystick (or a mouse?) and a keyboard, and making a noisy typing sound. On the small monitor some rectangles and squares of different colors were jumping around. She was playing some game. She said something about BASIC, and that's all we learnt.

Almost 10 years forward. It was the onset of the Russian winter, and I was walking with a senior towards the university. He was a smart guy, always an A-grader, and everybody respected him. We were talking about programming language theories. C++ was just getting popular. He said "Hey, I know Pascal and C. And this year we are learning some AI using Prolog. I've also been learning C++...". He paused. Then he suddenly said, "You know BASIC, right? Can you teach me that?". I didn't know how to respond, but just said "Sure". I was a bit confused but elated to 'teach' a senior. That opportunity never came though.

Current times. Among the several techniques of the Ashtadhyayi that form an illuminating parallel to programming, there is one that is intriguing. It is the word "aadi" given in a context. When Panini wants to mention a group of items, he just uses the first value of the group and suffixes it with "aadi" or "aadya:". The reader is expected either to know the list by heart or to look it up. No big deal, when the average Sanskrit student is expected to know the amarakosha by heart anyway. So the first value of the list itself is used as the "head" to reference the list. This way Panini passes a pointer to an array of data using a very simple technique.

A pseudo code may clarify:

/* The list of verbal roots, called the dhaatu paatha */
static Map DHAATU_PAATHA = [bhU:sattaayaam, ... ]

/* pointer to the dhaatu paatha list; trying to mimic naturalness - intentionally referring to it not via the static variable but via the head-value of the list */
char *list_of_verbs = "bhU"

Look at some of the sutra-s -

bhUvAdayo dhaatavaH (1.3.1) | By this statement Panini refers to about 2000+ verbal roots in Sanskritam, starting with bhU
sanaadyantaa dhaatavaH (3.1.32) | Refers to the list of derivational roots, the list starts with a verb that ends with suffix "-san"
praadayaH | Refers to the 22 prepositions that start with "pra"

Obviously this technique of "aadi" reference is pretty common in Sanskritam and other Indian literature. Tyagaraja in his chittaranjani kriti naadatanumanisham says "sadyojaataadi pancha-vaktra", referring to the five faces of Shiva starting from sadyojaata. Obviously one who is not aware of the details will not know what the rest are, but aadi is just what it is - a pointer to a list of information. If Panini was the one who invented it (let's assume so for the sake of argument, because Panini had predecessor grammarians too and there was obviously other literature before him), it is a brilliant technique. The technique is not perfect though, because over time somebody could come up with a modified list with the same head-value. But still it's a great way to abstract information, where the uniqueness of the head-value serves as an emphasised indicator of the contents following it.
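The head-value reference can also be sketched in a language that actually runs. This is only an illustration in Python: the `GROUPS` registry and the `register` and `aadi` helpers are made-up names, and the one list filled in is the five faces of Shiva referred to in the kriti.

```python
# Hypothetical registry: every list is filed under its first ("head") value.
GROUPS = {}

def register(items):
    """File a list under its head value, as a group is known by its first member."""
    GROUPS[items[0]] = tuple(items)

def aadi(head):
    """Resolve "<head>-aadi" (the head "and the rest") to the whole list."""
    return GROUPS[head]

# The five faces of Shiva, starting from sadyojaata.
register(["sadyojaata", "vaamadeva", "aghora", "tatpurusha", "iishaana"])

pancha_vaktra = aadi("sadyojaata")
print(len(pancha_vaktra), pancha_vaktra[-1])  # prints: 5 iishaana
```

Registering a second list with the same head silently replaces the first one, which is exactly the weakness of the technique: the head-value has to stay unique.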

Back to programming after the detour. Even after several years in programming, BASIC continues to fascinate me. Given all kinds of high level languages, there is one feature I think I sorely miss from BASIC. It is the "READ...DATA" statement. The READ...DATA statement allows for feeding data to the program in the shortest possible way without having to assign random values individually.

10 FOR I = 1 TO 10: READ X(I): NEXT I
20 DATA 1,3,5,7,11,13,17,19,23,29
30 RESTORE 20

10 READ NAME$, PHONE$, PI, BASERADIX
20 DATA "James Bond", "555-1212", 3.14, 8
30 DATA "11/11/2011", "All the world's a Pre-Production."
50 READ DATE$, WS_QUOTE$

The DATA statement could be anywhere in the program and the READ statement would sequentially read off the data, like taking items from the front of a queue. The RESTORE statement acts just like the "aadi" of ashtadhyayi - it points back to the beginning of the data. The simplicity of the bootstrap data feed is appreciated when you do not care where the DATA is set. Several high level languages have been invented since, but not many provide such an easy way to feed bootstrap data to the program variables. Of course there are enumerators and similar stuff, but somehow the simplicity of the READ statement stands out. Just like Panini's aadi technique.
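For those who never met BASIC, the READ/RESTORE mechanics can be imitated in a few lines of Python. This is a rough sketch, not how any BASIC interpreter is implemented: the `DataFeed` class and its method names are invented for illustration, and it restores only to the very beginning of the data rather than to a given line number.

```python
class DataFeed:
    """A pool of DATA values with a read pointer, in the spirit of BASIC."""

    def __init__(self, *values):
        self._values = list(values)
        self._pos = 0  # the read pointer

    def read(self, count=1):
        """READ the next `count` values and advance the pointer."""
        if self._pos + count > len(self._values):
            raise IndexError("out of DATA")
        chunk = self._values[self._pos:self._pos + count]
        self._pos += count
        return chunk[0] if count == 1 else chunk

    def restore(self):
        """RESTORE: point back to the head of the data."""
        self._pos = 0

feed = DataFeed("James Bond", "555-1212", 3.14, 8)
name, phone, pi, base_radix = feed.read(4)
feed.restore()
print(feed.read())  # prints: James Bond
```

The program variables never need to know where the values were declared; they only ask for "the next one", or go back to the head.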

Tuesday, August 9, 2011

Six degrees of Sutras

One of the memorable humour tracks of Tamil movies is from arivAli (The Intelligent), made in the early 1960s. A black-and-white film, with simple clean humour. Very likely such movies were made in other Indian languages too. The husband asks his wife to make puri. The wife does not know how to make it, so the husband gives her instructions:

Husband: Take a vessel and put some wheat flour in it
Wife: Yeah, I know that!
Husband: Pour some water and salt and mix with flour
Wife: Yeah, I know that!
Husband: Make it into small round balls
Wife: Yeah, I know that!
Husband: Flatten it and make it round like appalam
Wife: Yeah, I know that!
Husband: Then what?
Wife: Aah, I don't know that...

In an earlier post we saw how a sutram's definition can be applied to the naming convention of variables in a modern programming context. A natural question arises - what are the types of sutram-s? As I have mentioned before, classification is the ancient Indians' forte. There are six types of sutram-s, which are defined, yeah you guessed it, in a shloka:

संज्ञा च परिभाषा च विधि: नियम एव च ।
अतिदेशो अधिकारश्च षड्विधम् सूत्र-लक्षणम् ॥

A lot has been described about this classification in wiki articles and elsewhere. Here we will just look at the parallels between this classification and programming concepts. संज्ञा - definition; परिभाषा - interpretation; विधि - rules; नियम - restriction; अतिदेश - extension; अधिकार - header/domain.

संज्ञा is a definition sutram. It gives a meaningful name to one or more symbols. Do you remember your early programming days, when you were told that it is good practice to give names to constants? (Do you still follow that?)

वृद्धिरादैच् | [1.1.1] {A, ai, au} are called vRuddhi |

In programming terms, this is like the classic C #define statement or final static/const statements in Java/C# world.

#define PI 3.14
#define IK_PRATYAHARA {i, u, Ru, Lu}

परिभाषा sutram is an interpretation or meta-rules sutram. Its function is to tell how to interpret other sutram-s.

तस्मात् इति उत्तरस्य | [1.1.67] When a sutra has a word in panchamI vibhakti (the ablative case), the operation applies to what comes right after that word.

Annotations are a very good equivalent of interpretation rules. This example will make it clear:

@WebMethod(POST)
public void updateUser() {
}

The annotation says that the method updateUser() must respond only to an HTTP POST call. It helps the runtime interpret the method in a certain way.

विधि sutram: What is the fundamental difference between a calculator and a computer? A regular calculator deals with numbers only, while the computer can make logical decisions.

विधि sutram is the classic if-else condition. In fact, the way पाणिनि has applied it is closer to aspect-based rules than to a vanilla if-else condition. There is a slight difference between "if, do this" and "when, apply this". While an if-condition has to be encountered by the executing thread, a when-rule applies whenever a certain condition occurs. "if" is execution-based, while "when" is time-based: a trigger with an "if". पाणिनि defines several such rules in his अष्टाध्यायी. For example, there is a famous sandhi rule of just three words that covers a whole matrix of sandhi-s.

इको यण् अचि | [6.1.77] When a vowel follows, the letters i, u, Ru, Lu change to y, v, r, l.

Obviously any if-else condition would be a vidhi sutram. If we were to follow paaNini's technique, though, we would not write it as a simple if-condition. Instead we would define all the "when" rules in a separate section of the code, and provide aspects for applying them. When the code executes, the aspects monitor it and "apply" a rule when its conditions are satisfied. For eg, Haley (now Oracle's) is one of the very popular Rules Engine products, in which rules can be defined in simple English.
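A toy version of such a "when" section can be sketched in Python, with इको यण् अचि as its only rule. Everything here is an assumption made for illustration: the `RULES` list, the `when` decorator and the `join` function are invented names, the vowel list is simplified, and all the other sandhi rules that would normally compete with this one are ignored.

```python
RULES = []  # hypothetical rule section, kept apart from the code that uses it

def when(condition):
    """Register a function as a rule that fires whenever `condition` holds."""
    def decorator(action):
        RULES.append((condition, action))
        return action
    return decorator

# ik -> yaN; the two-letter keys come first so that "Ru" wins over "u".
IK_TO_YAN = {"Ru": "r", "Lu": "l", "i": "y", "I": "y", "u": "v", "U": "v"}
VOWELS = ("a", "A", "i", "I", "u", "U", "Ru", "Lu", "e", "ai", "o", "au")

def ik_before_vowel(left, right):
    return left.endswith(tuple(IK_TO_YAN)) and right.startswith(VOWELS)

@when(ik_before_vowel)
def iko_yan_aci(left, right):
    for ik, yan in IK_TO_YAN.items():
        if left.endswith(ik):
            return left[:-len(ik)] + yan + right

def join(left, right):
    """Apply the first registered rule whose condition is met, if any."""
    for condition, action in RULES:
        if condition(left, right):
            return action(left, right)
    return left + " " + right

print(join("dadhi", "atra"))  # prints: dadhyatra
print(join("madhu", "ari"))   # prints: madhvari
```

The calling code (`join`) contains no sandhi logic at all; adding another rule means adding another `@when` function to the rule section.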

नियम sutram
If programming was based on socialistic ideologies, rules would apply uniformly to all cases (except the political class, of course). But reality is somewhat capitalistic, so there is always some case which disagrees to agree. Remember the business requirements like "should apply to all but one or two cases" and the complex if-conditions you would have to write to handle just the boundaries?

Restriction is not Exception. Exception is an alternative flow, but restriction is about applying a rule to a lesser number of or rarer cases. In general, it is achieved by if-else conditions, but doesn't always have to be so.

अतिदेश sutram-s are extensions. They qualify a pre-existing rule with another property, not originally possessed.

Using pratyaya-s vat and mat, paaNini extends the behaviour of one rule and applies to another rule.

लोट: लङ्वत् | [] would mean "लोट् vibhakti-s are to be conjugated just like लङ्"

Imagine that लोट्, लङ् etc implement an interface called "ल". Panini's technique essentially casts लोट् as लङ्.

Class extensions are very common in OO languages. Using the extension mechanisms offered by Ruby, C# etc, one can add functionality to an existing class. For eg, the String class does not have an instance method to check whether the value is null or empty. One could add an IsNullOrEmpty(this String) method and operate directly on a string instead of going through a new helper class.

It does not end there. Imagine two objects User and LockedUser:

User user; //normal user behaviour
LockedUser lockedUser; //a locked user's behaviour; imagine a Trait (see Scala, which allows partially implemented interfaces)
if (user.failedAttempts(3)) { user.setBehaviour(lockedUser); } //user now behaves like a lockedUser

Instead of changing the state of the object (commonly via isLocked() or status = LOCKED), the behaviour of the object itself is changed. Upon certain conditions, the regular user adopts the behaviour of a locked user. We do not deal directly with properties and state changes; instead we work with behaviour changes. For eg, Scala offers Traits, with partially implementable methods, which are much more than interfaces. In a way you are describing or coding to the behaviour of an object rather than its states.
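Python has no Traits in the Scala sense, but the idea of swapping behaviour rather than flipping a status flag can still be sketched. The class names, the `login` method and the threshold of three failures below are assumptions for the sake of the example.

```python
class NormalBehaviour:
    def login(self, user):
        return f"{user.name} logged in"

class LockedBehaviour:
    def login(self, user):
        return f"{user.name} is locked out"

class User:
    MAX_FAILED_ATTEMPTS = 3  # assumed threshold, as in the snippet above

    def __init__(self, name):
        self.name = name
        self.failed_attempts = 0
        self.behaviour = NormalBehaviour()

    def record_failure(self):
        self.failed_attempts += 1
        if self.failed_attempts >= self.MAX_FAILED_ATTEMPTS:
            # No status flag is set: the user simply starts behaving
            # like a locked user from here on.
            self.behaviour = LockedBehaviour()

    def login(self):
        return self.behaviour.login(self)

user = User("rAma")
for _ in range(3):
    user.record_failure()
print(user.login())  # prints: rAma is locked out
```

Note that `login` never checks whether the user is locked; it just delegates to whatever behaviour the user currently has.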

अधिकार sutram: If you have done database modeling, you know Subject Areas. In programming terms, think of package, namespace etc. They all define a domain under which certain rules/classes/tables are grouped. That's exactly what an adhikara sutram is. More information on adhikara sutram can be found in this post.

संहितायाम् | [] "During the closeness of words"
प्रत्यय: | [3.1.1] "Affix"

package com.microsoft.office;
namespace Com.Sun.Oracle.Java;
//ok, I was being facetious :-)

Apart from these core sutram-s there are a few more.

निषेध sutram-s are negation rules of other rules. While niyama sutram-s are positive restrictions, nishedha can be seen as negative orientation.

हलन्त्यम् | (consonant endings are it-markers)
न विभक्तौ तुस्माः | (but the letters t, th, d, dh, n, s, m are not it-markers when they end an inflectional suffix)

In programming, we can come up with a validation rules engine like this:

1. All Address Fields Required
2. Not if Address Line 2

Imagine the simplicity of a program using such a validation engine!
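A bare-bones version of that two-rule engine might look like this in Python. The field names and the `missing_fields` helper are hypothetical; the point is only that the general rule and its negation are stated separately and combined at the last moment.

```python
ADDRESS_FIELDS = ["line1", "line2", "city", "postal_code"]  # hypothetical field names

REQUIRED = set(ADDRESS_FIELDS)  # rule 1: all address fields required
NOT_REQUIRED = {"line2"}        # rule 2: not if address line 2

def missing_fields(address):
    """Return the required fields that are absent or empty."""
    required = REQUIRED - NOT_REQUIRED  # the negation carves its exception out of the general rule
    return sorted(field for field in required if not address.get(field))

print(missing_fields({"line1": "12 Temple Street", "city": ""}))
# prints: ['city', 'postal_code']
```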

विभाषा sutram is an optional rule. For eg, think of the sentence "I would like to go to a movie". You can also say it as "I'd like to go to a movie". The shortening of "I would" to "I'd" does not change the meaning, and is optional. पाणिनि uses this technique to provide alternative usages of words and grammar, using the shabda-s vaa, vibhaaShaa, anyatarasyaam. Optional rules are very common and are done using if-conditions.

Besides these, some of paaNini's techniques also strike a chord with modern techniques. For eg, there is an interpretation sutra विप्रतिषेधे परम् कार्यम्, which means "In case of rule-conflicts, the latter rule prevails". Virtual override feature?
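Leaving virtual overrides aside, "the latter rule prevails" can be shown directly on a list of rules. This Python sketch is an assumption-laden toy: the `resolve` function and the two sample rules are invented, and real conflict resolution in the अष्टाध्यायी involves more than rule order.

```python
def resolve(rules, value):
    """Among the rules whose condition matches, apply the one defined last."""
    applicable = [action for condition, action in rules if condition(value)]
    return applicable[-1](value) if applicable else value

rules = [
    (lambda n: n % 2 == 0, lambda n: "even"),
    (lambda n: n % 4 == 0, lambda n: "multiple of four"),  # later, so it wins a conflict
]

print(resolve(rules, 8))  # prints: multiple of four
print(resolve(rules, 6))  # prints: even
print(resolve(rules, 7))  # prints: 7
```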

Another unique technique is called स्थानी भाव where a substituting suffix can retain the characteristics of a substituted suffix. Heard of Liskov Substitution Principle? Yeah, something like that.

Panini also uses recursive techniques for some of the rule operations. We will see that in a subsequent post.

Yet another ingenious technique is seen in the last 3 pada-s of अष्टाध्यायी, where every earlier rule is oblivious to all the later rules. The rules are arranged in such a way that every rule "thinks" that it is the last rule of the book.

So what is the benefit of comparing modern programming with a 2500+ year old textbook of grammar? Let the algorithms be founded by पाणिनि:, Capellini or Linguini. As a software engineer, why should I care? I do not have an answer. After all, a programmer consultant in the US writes a for-loop for $75 an hour, while the same for-loop is written by someone in China for a few yuan. Can you judge which for-loop is better?

If several of modern programming concepts have indeed parallels in अष्टाध्यायी, how about some concepts in अष्टाध्यायी not yet formulated into modern programming theories? What if they could create a fundamental change in theory of programming?

What would पाणिनि: think about modern programming concepts? To find out, we shall send Donald Knuth back in time as our representative.

Knuth: Programming is about definitions, rules and algorithms.
पाणिनि: आम्, जानाम्येव ! (Yes, I know that!)
Knuth: Using algorithms we derive and solve various equations.
पाणिनि: आम्, जानाम्येव !
Knuth: In object oriented programming, we do abstraction, polymorphism and other cool things.
पाणिनि: आम्, जानाम्येव !
Knuth: With functional programming, we define functions, recursions and closures.
पाणिनि: आम्, जानाम्येव !
Knuth: Then we create programs to play games like Grand Theft Auto all day long.
पाणिनि: अहो ! तदहम् न जानामि !! (Aah! That I don't know!!)

Wednesday, December 9, 2009

From raama to raamaha

In a previous post we saw how a programming language can be written effectively in a natural language using Paninian sutra style.

In this post, let us do the exact opposite: convert a set of Panini-sutras to a programming-language style syntax to understand the rules of Sanskritam. The aim of the post is to kindle the interest of a typical software engineer in Sanskrita studies by showing the parallels between programming concepts and Panini's methods. Plenty of statements and research on Panini's methodology and how close it is to programming can be found by googling around. But as the ancient saying goes, the proof of the code is in the compiling. Of course I won't be delivering code here, but hopefully a pseudo-code should convince any software engineer. Much of the pseudo-code can be polished and implemented in languages like Groovy/Ruby that support expando, reflection etc.

Beginner Sanskrita students are often confused between rAma and rAma: (pronounced raamaha). Why do we add a visarga? Does the addition of visarga change the meaning?

The short answer is: Yes, the addition of visarga does add meaning.

In a non-inflexional language like English, prepositions provide the purpose of the noun. For eg by Rama, to Rama, from Rama, in Rama: in these cases "Rama" stays constant, while the prepositions provide the notion. Hindi also exhibits non-inflexional properties (rAm ne, rAm ko, rAm se, rAm par). In inflexional languages, the noun itself is modified to denote the purpose.

So what does this have to do with rAma: ? The word rAma: can be split into rAma + visarga. Here rAma is called the "stem" or "nominal stem". In Sanskrita it's called prAtipadikam (प्रातिपदिकम्). prAtipadikam is defined as "arthavat adhAtu apratyaya" (अर्थवत् अधातु अप्रत्यय प्रातिपदिकम्) -- that which has meaning, and is neither a root nor a suffix, is called prAtipadikam. This stem will undergo modifications (inflexions) to fulfill the purpose of the noun.

Panini provides the methodology of modifying stem "rAma" to "rAma:" in a few sutras.

su aujasamauTChaShTAbhyAmbhis~gebhyAmbhyas~gasibhyAmbhyas~gasosAm~gyossup
upadeSe ajanunAsika it
sasajuSho ru:
kharAvasAnayo: visarjanIya:

The same in devanAgarI:

सु औजसमौट्छष्टाभ्याम्भिस्ङेभ्याम्भ्यस्ङसिभ्याम्भ्यस्ङसोसाम्ङ्योस्सुप्
उपदेशे अजनुनासिक इत्
ससजुषो रु:
खरावसानयो: विसर्जनीय:

Forget the tongue-wrecking, memory-bending first sutra for now. We will see its utility in the future posts.

Let's do some pseudo-code now.

//purpose of the noun - what do we want? singular/plural, masculine/feminine etc.
def purpose
//the stem to use; based on the purpose, this stem will now change
def stem = "rAma"
//anunAsika vowels for sutra #2 (using single quotes to denote nasalization)
def nasalVowels = ["a'", "A'", "i'", "I'", "u'", "U'", "R'", "R.'", "e'", "ai'", "o'", "au'"]

//Requirement: create a nominative-singular-masculine form of rAma -- prathamA vibhakti, ekavachanam, pumlinga from prAtipadikam rAma

def create_nominative_singular_masculine_noun_from_stem(stem) {
    if (purpose.isMasculine() && purpose.isSingular() && purpose.isNominative()) stem.append("su'") //sutra #1: rAma -> rAmasu'
    if (stem.endsWithAnyOf(nasalVowels)) stem.removeLast(nasalVowels) //sutra #2: rAmasu' -> rAmas
    if (stem.endsWith("s")) stem.replaceLast("s", "ru'") //sutra #3: rAmas -> rAmaru'
    if (stem.endsWithAnyOf(nasalVowels)) stem.removeLast(nasalVowels) //sutra #2 again: rAmaru' -> rAmar
    if (stem.endsWith("r")) stem.replaceLast("r", ":") //sutra #4: final r at a pause -> visarga
    return stem //rAma:
}

Following the algorithmic steps, when the intention of one rAma is to be in nominative case (or as a subject), the end result is that a visarga is appended. Just 'rAma' does not denote anybody. "rAma:" denotes one masculine person in nominative/subject form.
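For those who want to actually run the steps, here is a small Python rendering of the same derivation. It is a sketch under simplifying assumptions: the function names are made up, the check on gender, number and case is left out (the caller is assumed to want the nominative singular masculine), and only the pause condition of sutra #4 is handled.

```python
# Nasalized vowels are written with a trailing quote; longer ones are listed first.
NASAL_VOWELS = ("ai'", "au'", "R.'", "a'", "A'", "i'", "I'", "u'", "U'", "R'", "e'", "o'")

def drop_it_marker(form):
    """Sutra #2: a trailing nasalized vowel is only a marker and is dropped."""
    for vowel in NASAL_VOWELS:
        if form.endswith(vowel):
            return form[:-len(vowel)]
    return form

def nominative_singular_masculine(stem):
    """Return the final form together with every intermediate step."""
    steps = [stem]
    form = stem + "su'"            # sutra #1: append the suffix su'
    steps.append(form)
    form = drop_it_marker(form)    # sutra #2: rAmasu' -> rAmas
    steps.append(form)
    if form.endswith("s"):
        form = form[:-1] + "ru'"   # sutra #3: final s -> ru'
        steps.append(form)
        form = drop_it_marker(form)  # sutra #2 again: rAmaru' -> rAmar
        steps.append(form)
    if form.endswith("r"):
        form = form[:-1] + ":"     # sutra #4: final r at a pause -> visarga
        steps.append(form)
    return form, steps

form, steps = nominative_singular_masculine("rAma")
print(" -> ".join(steps))
# prints: rAma -> rAmasu' -> rAmas -> rAmaru' -> rAmar -> rAma:
```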

Naturally, a question arises - why don't we just add a visarga at the end instead of going through all these rules? Note that this visarga is only for a masculine form. For neuter and feminine nouns a su' will be added too, but other rules prevent it from morphing into a visarga. So Panini adds a common suffix and specifies rules on how it is applied in various situations.

In the next post, let us look at making the above method efficient.

Thursday, November 19, 2009

Sutra based Programming

Haven't software engineers had enough of programming languages? C and C++ were there for a while unchallenged. Then Java took over C/C++ pretty quickly. Even though you could create complex applications using Java and run on any platform, there is no dearth of new languages. Scala, Groovy, Ruby and very recently Go from Google.

All high-level programming languages do 4 basic things:

Assignment
Condition
Loop
Function/Procedure.

Loop and Function/Procedure/Subroutine are essentially glorified Goto.

Everything else is syntactic sugar.

Galileo has influenced us heavily to think math in terms of symbols. A mathematical symbol expresses an idea very concisely and effectively. Much more than a language could do.

So

x = 1

is more concise than saying "let x equal to 1".

Let me rephrase the above bereft of assumptions --

The mathematical expression "x = 1" is more concise than saying it in English "let x equal to 1".

But what if we could express "x = 1" even more concisely in a natural language?

For e.g., let me just say

x 1

I just dropped the symbol "=". One symbol less! Woohoo! I also established a convention that whatever is on the left side is the receiver and whatever is on the right side is the provider. In a language like Sanskrit, for e.g., the "is/happens" is implicit (asti/bhavati). So no other verb is required. (There are other languages that exhibit this property too).

People think that just because ancient Sanskrit mathematicians did not use symbols, their works are not scientific. A symbol is just a convenience; if language could be more powerful than symbols, who needs them? As a side note, Sanskrita uses almost no punctuation (except for the end of a sentence, marked by the pipe character |). In contrast, English just can't be "communicated" without appropriate punctuation.

To demonstrate the power of the language, let us look at a simple program using a pseudo-language.

//Pseudo-code to produce a random number and determine if it's odd or even
s = 100;
r = (int) rand(s);
if (r % 2 == 0)
    return "even";
else
    return "odd";

The question is, can this program be expressed using natural language? Of course, we can write the whole program in plain English, but it won't be concise. (COBOL anyone, hello?)

Now let's apply the rules of the Sanskrit grammarian Panini. I am going to keep the function names as is (in bold), but conjugate the variable names per Sanskrit rules (taking them to be consonant-ending variables).


1: aSeSha: SUnyam  // No remainder 0 (No remainder equals 0)
2: s Satam         // s Satam asti (s is 100)
3: sa: rand r      // of s random is r (r is random of s)
4: int ca          // int also (r is also int of random of s)
5: even ra: aSeShe dvibhAjane  // "even" is of r during division by 2 when no remainder happens (if dividing r by 2 leaves no remainder, it is even)
6: odd SeShe       // "odd" when there is a remainder

Here are the sutras in proper Sanskrit:

अशेष: शून्यम् |
स् शतम् |
स: वृथा र् |
अभिन्न: च |
समं र: अशेषे द्विभाजने |
विषमं शेषे ||

As you can see there are absolutely no mathematical symbols! A program is written purely by the expressive power of language. So what happened? How are they equal?

Using the nominative case and the implicit "is", Panini eliminated the need for an equals sign. Functions are defined via the genitive case. Using the locative case, Panini provides the if-else condition. In effect, mathematical expressions are substituted by simply conjugating the variables.

That is the genius contribution of Panini to Sanskrita! Now a skilled poet could come and rearrange the above 6 sutras into a sloka format, and lo! there is a sloka that tells us how to determine an even/odd number!

Let me attempt a half-baked sloka (May Sanskrita enthusiasts forgive me for such a blasphemy).

अशेषो शून्यं भूयात् सेकशतं वृथा रेफ: स: ।
अभिन्नश्च द्विभाजने विषमं समं शेषोऽशेषे ॥

Due to the occurrence of words like SeSha, SUnyam, aSeSha, dvibhAjane and abhinna, the above can be misinterpreted as referring to Adisesha, SUnyavAda, Vishnu, Dvaita, Advaita etc. Now we have an example of a sloka that seems to refer to the gods, with a mathematical algorithm encoded in it!

Tuesday, October 27, 2009

A Sutra for Naming Conventions

Many have heard of Patanjali's yoga sutra, Brahma sutra, Panini sutra etc. Literally it just means 'a thread'. But what is it really? What qualifies as a sutra? When does a sentence become a sutra?

It turns out there is a definition of sutra. A sutra must exhibit all 6 characteristics to be called so. What are they? As usual, the Sanskrit grammarians have come up with a verse that defines a sutra.

alpAkSharam asandigdham sAravat vishvato mukham |
astobham anavadyam cha sUtram sUtravido vidu: ||

अल्पाक्षरम् असन्दिग्धम् सारवत् विश्वतो मुखम् ।
अस्तोभम् अनवद्यम् च सूत्रम् सूत्रविदो विदु: ॥

Source: vaayu puraaNa (anytime before 500 BC)

alpAksharam - Concise
asandigdham - without any doubt ie unambiguous or should have a singular meaning that is conveyed
sAravat - meaningful, ie should not contain gibberish
vishvatomukham - Properly applicable
astobham - devoid of 'stobha' (kind of fillers in Vedic chanting) like hA hU
anavadyam - irrefutable (na avadyam - that which cannot be refuted)

Those who know sutras know a sutra to be so.

Now, how does this apply to a programming style?

One of the hardest things to do in software development is to understand others' code. Every developer would have come across somebody else's code and declared it 'the ugliest piece of code ever seen in life'.

What is ugly code actually? In general, hard to understand, spaghetti type code can be considered ugly code. Typically a lot of confusion arises from what the developer is trying to convey by means of the names of variables, classes, methods etc. A novice developer names a variable based on what he or she thinks. An experienced developer names a variable so that a novice would understand it without effort. In general the Shakespearean quote "What's in a name?" just does not apply to programming. A rose may smell the same even if it's called dog-poop, but it's definitely a code-smell if naming conventions are poor.

Let's see how each characteristic of this ancient definition of sutra applies to naming conventions.

alpAksharam: Must be concise.
For eg. age, firstName, addressLine1 etc.

asandigdham: Must be unambiguous.
Eg. temp: What does it denote? A temporary variable? temperature? template?
Eg. Either use 'login' or 'logon' everywhere, but do not mix.
Eg. getReg(): What does it return? Registration? Registry? Regular Expression?
Eg. code, date: What kind of code? What kind of date?

sAravat: Pithy; Meaningful; Should not contain gibberish
eg. clr; tmpk; fru; stp, lzp. Combining this with the alpAksharam and asandigdham rules - will give a proper meaningful name.

vishvatomukham: Properly applicable
For eg, a variable name must have a proper scope. Eg. avoid local method variables having same name as member variables.

astobham: Devoid of unnecessary characters.
Bad eg: intx, a_b_c; Believe me, there are programmers who do this just to confuse others.

anavadyam: Flawless; Irrefutable
A naming of a variable or a class must describe exactly what it says. Another developer should not be given a chance to say "Why didn't you name this differently such that it is understandable?"
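To make this concrete, here is a small before/after in Python. Both functions are invented for illustration and do exactly the same thing; only the names differ, borrowing the bad examples listed above.

```python
# Before: the reader has to guess. Registry? Regular expression? What is temp?
def get_reg(temp, intx):
    return temp[intx]

# After: concise, unambiguous, no filler characters, and nothing left to refute.
def get_registration(registrations, student_id):
    return registrations[student_id]

registrations = {42: "Sanskrit 101"}
print(get_registration(registrations, 42))  # prints: Sanskrit 101
```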
