Upe Blog

Sunday, November 29, 2015

UltraESB - Cloning Script

In some cases you may need to build an UltraESB cluster running on a single machine. In that case you must make sure that the ports each UltraESB instance acquires do not conflict with those of the others. This script clones an UltraESB installed in the default location (/opt/ultraesb) into a given number of instances at /opt/uesb1, /opt/uesb2, /opt/uesb3 and so on. To run it, pass the number of clones you need as a parameter.

#!/bin/bash

# Number of clones to create, passed as the first argument
count=$1

# Brace expansion such as {1..$1} does not accept variables, so use an arithmetic loop
for ((step = 1; step <= count; step++))
do
    mkdir /opt/uesb$step

    # Create a clone of UltraESB
    cp -r /opt/ultraesb/* /opt/uesb$step/

    # Change ULTRA_HOME
    sed -i "s/\/opt\/ultraesb/\/opt\/uesb$step/g" /opt/uesb$step/bin/ultraesb-daemon.sh

    # Change ram disk path
    sed -i "s/\/tmp\/ram/\/tmp\/ram$step/g" /opt/uesb$step/conf/ultra-root.xml

    # Change ram disk overflow path
    sed -i "s/\/tmp\/overflow/\/tmp\/overflow$step/g" /opt/uesb$step/conf/ultra-root.xml

    # Change HTTP port
    sed -i "s/property name=\"port\" value=\"8280\"/property name=\"port\" value=\"$((step+8280))\"/g" /opt/uesb$step/conf/ultra-root.xml

    # Change HTTPS port
    sed -i "s/property name=\"port\" value=\"8443\"/property name=\"port\" value=\"$((step+8443))\"/g" /opt/uesb$step/conf/ultra-root.xml

    # Change host name
    sed -i "s/name=\"nodeName\" value=\"192.168.56.5\"/name=\"nodeName\" value=\"node$step\"/g" /opt/uesb$step/conf/ultra-root.xml

    # Change JMX ports
    # Note: ${step}9994 exceeds the maximum port number (65535) from the sixth clone onwards
    sed -i "s/9994/${step}9994/g" /opt/uesb$step/conf/ultra-root.xml
    sed -i "s/1099/1${step}99/g" /opt/uesb$step/conf/ultra-root.xml

    # Change wrapper name
    sed -i "s/wrapper.ntservice.name=UltraESB/wrapper.ntservice.name=uesb$step/g" /opt/uesb$step/conf/wrapper.conf

    # Add init scripts
    cd /etc/init.d
    sudo ln -s /opt/uesb$step/bin/ultraesb-daemon.sh uesb$step

    sudo chown -R ultraesb:ultraesb /opt/uesb$step/

    #sudo service uesb$step start
done
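
To see which ports each clone ends up with, the substitutions can be replayed without touching any files. This is only a minimal sketch: the helper names ports_for_clone and ports_valid are made up for illustration, and the labels jmx9994 and jmx1099 simply name the two default values that the script rewrites.

```shell
#!/bin/bash

# Print the ports that the sed substitutions in the cloning script would
# produce for clone number $1. (Hypothetical helper, for illustration only.)
ports_for_clone() {
    local step=$1
    echo "http=$((step + 8280)) https=$((step + 8443)) jmx9994=${step}9994 jmx1099=1${step}99"
}

# Succeed only if every derived port is at most 65535, the highest valid port.
ports_valid() {
    local step=$1 port
    for port in "$((step + 8280))" "$((step + 8443))" "${step}9994" "1${step}99"; do
        [ "$port" -le 65535 ] || return 1
    done
    return 0
}

# Show the ports for the first three clones
for n in 1 2 3; do
    ports_for_clone "$n"
done
```

Because the first JMX substitution prefixes the clone number to 9994, ports_valid succeeds for clones 1 to 5 and fails from clone 6 onwards (69994 is above 65535), so a different scheme is needed for larger clusters.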

Installing OpenStack with Neutron networking using DevStack on Ubuntu 14.04

DevStack [1] is a set of scripts that makes it easy to install OpenStack on Ubuntu without the painful configuration involved in installing OpenStack from scratch. In this post I describe how to install OpenStack on a single machine with networking (neutron) using DevStack. The setup covers OpenStack's main services (nova, keystone, cinder, glance and horizon) plus the additional networking service (neutron).

Prerequisites

Before continuing with the installation, verify that your machine meets the following requirements:

1. At least 4 threads in your processor

2. 4 GB physical memory

3. A physical network interface connected to a router, with free IPs in that subnet to allocate to VMs. In my case the interface address is 192.168.1.200 in the network 192.168.1.0/24

4. An internet connection. DevStack may need to download required libraries from the internet during installation

5. 100 GB of free space in the partition (if you do not need to create large volumes, you can proceed with the installation with less space)
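
The numeric requirements above can be checked with a small script before starting. This is a rough sketch under a few assumptions of mine: it checks the root partition (/), it uses 3800 MB as the memory threshold because the kernel reports slightly less than the installed 4 GB, and the function names check_min and preflight are made up for illustration.

```shell
#!/bin/bash

# Compare a measured value against a required minimum and report the result.
# Arguments: label, actual value, required minimum.
check_min() {
    if [ "$2" -ge "$3" ]; then
        echo "OK:   $1 = $2 (need >= $3)"
    else
        echo "FAIL: $1 = $2 (need >= $3)"
        return 1
    fi
}

# Measure CPU threads, memory and free disk space on this machine.
preflight() {
    local threads mem_mb disk_gb rc=0
    threads=$(nproc)
    # MemTotal in /proc/meminfo is reported in kB
    mem_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
    # df -Pk reports available space in 1 kB blocks; / is an assumed partition
    disk_gb=$(df -Pk / | awk 'NR == 2 {print int($4 / 1024 / 1024)}')
    check_min "CPU threads" "$threads" 4 || rc=1
    check_min "Memory (MB)" "$mem_mb" 3800 || rc=1
    check_min "Free disk (GB)" "$disk_gb" 100 || rc=1
    return $rc
}

preflight || echo "Some prerequisites are not met"
```

The network and internet requirements are not covered here, since they depend on your router and subnet.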

Installation steps

1. Clone DevStack from the following URL
git clone https://git.openstack.org/openstack-dev/devstack

2. Go to the DevStack directory, create a file named local.conf and add the following configuration details.

[[local|localrc]]
MULTI_HOST=1
LOGFILE=/opt/stack/logs/stack.sh.log
ADMIN_PASSWORD=123456
DATABASE_PASSWORD=123456
RABBIT_PASSWORD=123456
SERVICE_PASSWORD=123456
SERVICE_TOKEN=xyzpdqlazydog
API_RATE_LIMIT=False

# neutron (networking) configuration
HOST_IP=192.168.1.200 # IP of your Ethernet interface
disable_service n-net
enable_service q-svc
enable_service q-agt
enable_service q-dhcp
enable_service q-l3
enable_service q-meta
enable_service q-metering
Q_USE_SECGROUP=True

FLOATING_RANGE="192.168.1.0/24" # floating (public) IP range of external interface that can be 
                                # used to access VMs from outside
FIXED_RANGE="10.0.0.0/24"       # Fixed IP range that is assigned to VMs for housekeeping 
                                # tasks of Openstack
Q_FLOATING_ALLOCATION_POOL=start=192.168.1.226,end=192.168.1.254

PUBLIC_NETWORK_GATEWAY="192.168.1.1"
Q_L3_ENABLED=True
PUBLIC_INTERFACE=eth0           # Ethernet interface name
Q_USE_PROVIDERNET_FOR_PUBLIC=True
OVS_PHYSICAL_BRIDGE=br-ex
PUBLIC_BRIDGE=br-ex
OVS_BRIDGE_MAPPINGS=public:br-ex

# Optional, to enable tempest configuration as part of DevStack
# enable_service tempest

# cinder volume configuration
# By default cinder creates an LVM partition with a size of 10 GB, which limits you to
# volumes smaller than 10 GB. If you want to increase this default value, uncomment the
# following lines
# VOLUME_GROUP="stack-volumes"
# VOLUME_NAME_PREFIX="volume-"
# VOLUME_BACKING_FILE_SIZE=60250M

3. Run stack.sh to install OpenStack using the DevStack scripts.
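
Before running stack.sh, it can help to confirm that the addresses in Q_FLOATING_ALLOCATION_POOL and PUBLIC_NETWORK_GATEWAY really fall inside FLOATING_RANGE, since a typo there is easy to miss. The following is a minimal sketch of such a check; the helper functions ip_to_int and in_cidr are my own and are not part of DevStack.

```shell
#!/bin/bash

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<EOF
$1
EOF
    echo "$(( (a << 24) | (b << 16) | (c << 8) | d ))"
}

# Succeed if address $1 lies inside the CIDR block $2 (e.g. 192.168.1.0/24).
in_cidr() {
    local ip net bits mask
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ "$(( ip & mask ))" -eq "$(( net & mask ))" ]
}

# Values copied from the local.conf in step 2: pool start, pool end, gateway
FLOATING_RANGE="192.168.1.0/24"
for addr in 192.168.1.226 192.168.1.254 192.168.1.1; do
    if in_cidr "$addr" "$FLOATING_RANGE"; then
        echo "$addr is inside $FLOATING_RANGE"
    else
        echo "$addr is OUTSIDE $FLOATING_RANGE"
    fi
done
```

With the values from this post, all three addresses are reported as inside the range.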

[1] http://docs.openstack.org/developer/devstack/

Monday, September 7, 2015

Profiling UltraESB with YourKit Java Profiler

YourKit Java Profiler is a rich Java profiling tool that can be used to easily identify CPU usage, memory usage, thread utilisation, garbage collections and possible deadlocks in your Java applications. In this post I'll briefly go through how to profile an UltraESB instance hosted on an EC2 instance using YourKit Java Profiler.

Before moving into further details, you have to download YourKit Java Profiler from their site. If you don't have a distribution of UltraESB, you can download a binary distribution of UltraESB from here.

EC2 Setup

Because this is a remote profiling session between an EC2-hosted UltraESB instance and your local YourKit application, ports 10001 - 10010 should be opened for external access when setting up the EC2 instance.

Figure 1

Both the UltraESB distribution and YourKit have to be on the EC2 instance. To configure UltraESB with YourKit, the JVM_OPTS line of ultraesb.sh in the <UltraESB Home>/bin directory should be changed as shown in Figure 2.


Figure 2
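
Since the figure is an image, here is a rough sketch of what such a change might look like. Attaching a native agent to a JVM is done with the standard -agentpath option, and the YourKit agent accepts a port option. The agent path below is hypothetical, and the real JVM_OPTS line in ultraesb.sh may already contain other options that should be kept.

```shell
#!/bin/bash

# Hypothetical location of the YourKit agent on the EC2 instance; adjust it to
# wherever YourKit is unpacked and to the platform directory that matches.
YJP_AGENT="/home/ubuntu/yjp/bin/linux-x86-64/libyjpagent.so"

# Append the agent to whatever options are already set. port=10001 is the first
# port of the range opened for external access on the EC2 instance.
JVM_OPTS="$JVM_OPTS -agentpath:$YJP_AGENT=port=10001"

echo "$JVM_OPTS"
```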

The path of the libyjpagent.so file should be changed according to the platform. Once those configurations are done, you can start UltraESB from

<UltraESB Home>/bin/ultraesb.sh

If the configuration is correct, you will see a line like the following at the top of the UltraESB log.

[YourKit Java Profiler 2015 build 15070] Log file: /home/ubuntu/.yjp/log/java-1445.log

You can put load on UltraESB using the jb-run tool that ships with the UltraESB distribution.

cd <UltraESB Home>/bin

./uterm.sh

jbrun -c 100 -d 1 -k -m POST -n 1000 -p /home/ubuntu/payload.txt -s 100 -t 150000 http://localhost:8280/service/echo-back

YourKit (local machine) setup

Start YourKit by running <YourKit Home>/bin/yjp.sh

Click "Connect to remote application" and fill in the EC2 machine's username and domain (Figure 3).

Figure 3

Add the security credentials that you use to log into the EC2 instance over ssh (Figure 4).


Figure 4

You can then see the YourKit dashboard, which shows memory usage statistics, thread utilisation and so on.

To profile CPU usage, click the start CPU profiling button.

Sunday, April 19, 2015

Launching simple echo experiment in Apache Airavata

Apache Airavata is a science gateway application that manages different scientific computational tasks across computational resources. Here I describe how to register a simple echo application in Airavata and launch it using the Airavata API.

1. Clone Airavata source https://github.com/apache/airavata/tree/0.14_release and build

2. Start Airavata server as given in https://cwiki.apache.org/confluence/display/AIRAVATA/XBAYA+Quick-Start+Tutorial

3. Run Sample class to register echo application
https://github.com/apache/airavata/blob/0.14_release/airavata-api/airavata-client-sdks/java-client-samples/src/main/java/org/apache/airavata/client/samples/RegisterSampleData.java

4. Run the following code to create and launch the experiment

Airavata.Client airavataClient = AiravataClientFactory.createAiravataClient("127.0.0.1", 8930);

String appId = "Echo_e82aa96b-66ea-4f31-97e7-1182a32e55d2";

List<InputDataObjectType> exInputs = new ArrayList<InputDataObjectType>();
InputDataObjectType input = new InputDataObjectType();
input.setName("Input_to_Echo");
input.setType(DataType.STRING);
input.setValue("Echoed_Output=Hello World");
exInputs.add(input);

List<OutputDataObjectType> exOut = new ArrayList<OutputDataObjectType>();
OutputDataObjectType output = new OutputDataObjectType();
output.setName("Echoed_Output");
output.setType(DataType.STRING);
output.setValue("");
exOut.add(output);

Experiment simpleExperiment = ExperimentModelUtil.createSimpleExperiment("default", "admin", "echoExperiment", "Echo Exp", appId, exInputs);
simpleExperiment.setExperimentOutputs(exOut);

Map<String, String> computeResources = airavataClient.getAvailableAppInterfaceComputeResources(appId);
String id = computeResources.keySet().iterator().next();
String resourceName = computeResources.get(id);
System.out.println(computeResources.size());
System.out.println(id);
System.out.println(resourceName);
ComputationalResourceScheduling scheduling = ExperimentModelUtil.createComputationResourceScheduling(id, 1, 1, 1, "normal", 30, 0, 1, "sds128");
UserConfigurationData userConfigurationData = new UserConfigurationData();
userConfigurationData.setAiravataAutoSchedule(false);
userConfigurationData.setOverrideManualScheduledParams(false);
userConfigurationData.setComputationalResourceScheduling(scheduling);
simpleExperiment.setUserConfigurationData(userConfigurationData);

String exp = airavataClient.createExperiment(simpleExperiment);
airavataClient.launchExperiment(exp,"sample");

Tuesday, April 14, 2015

SinMin - Sinhala Corpus Project

We (Dimuthu Upeksha, Chamila Wijayarathna, Maduranga Siriwardan, Lahiru Lasadun) started SinMin - Sinhala Corpus Project as our final-year undergraduate project, under the supervision of Dr. Chinthana Wimalasuriya, Mr. N. H. N. D. de Silva and Prof. Gihan Dias.

A rich language corpus enables a wide range of research on a language, including:

1. Statistical analysis of language usage patterns
2. Translators
3. Spell and Grammar Tools
4. Backend support to third party applications like OCR tools

Usually a corpus contains a collection of authentic texts of the language. However, rather than storing them only as raw text files, SinMin also stores them in different databases with different schemas. This enables SinMin to process language data easily in real time. In addition, the SinMin corpus has a REST API that enables third party applications to query and find data.

The SinMin web interface can illustrate and find patterns that occur in the Sinhala language over different time periods and different categories.

Useful Links

SinMin Web : http://sinhala-corpus.projects.uom.lk/sinmin-web
Crawled raw data files in XML format : http://sinhala-corpus.projects.uom.lk/sinmin-web/data
Documentation : http://sinhala-corpus.projects.uom.lk/docs
API documentation : http://sinhala-corpus.projects.uom.lk/docs/display/ds/REST+API
Source Code : https://github.com/sinmin/core

Additional Resources

Chamila's blog post : http://cdwijayarathna.blogspot.com/2015/04/sinmin-corpus-for-sinhala-language.html

SinMin Sinhala Corpus currently crawls data from the following sources

Sinhala Online Newspapers

Lankadeepa - http://lankadeepa.lk/
Divaina - http://www.divaina.com/
Dinamina - http://www.dinamina.lk/2014/06/26/
Lakbima - http://www.lakbima.lk/
Mawbima - http://www.mawbima.lk/
Rawaya - http://ravaya.lk/
Silumina - http://www.silumina.lk/

Sinhala News Sites

Ada Derana - http://sinhala.adaderana.lk/

Sinhala Religious and Educational Magazines

Aloka Udapadi - http://www.lakehouse.lk/alokoudapadi/
Budusarana - http://www.lakehouse.lk/budusarana/
Namaskara - http://namaskara.lk/
Sarasawiya - http://sarasaviya.lk/
Vidusara - http://www.vidusara.com/
Wijeya - http://www.wijeya.lk/

Sri Lanka Gazette in Sinhala - http://documents.gov.lk/gazette/
Online Mahawansaya - http://mahamegha.lk/mahawansa/
Sinhala Movie Subtitles - http://www.baiscopelk.com/category/සිංහල-උපසිරැස/
Sinhala Wikipedia - http://si.wikipedia.org/
Sinhala Blogs

Wednesday, October 8, 2014

Java Wrapper for Tesseract OCR Library

Tesseract is a very popular OCR library written in C++. It can be used to identify the characters in a given image that contains text. In addition, it can report the position of each word or character. Tesseract provides a command line tool and a C++ API. However, there is no implementation that lets Java users use Tesseract directly in their applications.

As a part of my GSoC project in Apache PDFBox, I implemented a Java wrapper for the Tesseract C++ API so that Java users can use Tesseract directly in their applications. The code repository can be found here.

To use the Java API, simply import Tesseract-JNI-Wrapper-1.0.0.jar into your project. If you are using Maven, add this to your pom:

<dependency>
  <groupId>org.apache.pdfbox.ocr</groupId>
  <artifactId>Tesseract-JNI-Wrapper</artifactId>
  <version>1.0.0</version>
</dependency>

Here is sample code that uses the Java API to invoke Tesseract.

public String getOCRText(BufferedImage image){ //You need to send BufferedImage (RGB) of scanned image
  TessBaseAPI api = new TessBaseAPI();
  boolean init = api.init("src/main/resources/data", "eng"); // position of Training data files
  api.setBufferedImage(image);
  String text = api.getUTF8Text();
  System.out.println(text);
  api.end();
  return text;
}

Getting positions of each OCRed word

public void printOCRTextPositions(BufferedImage image){
  TessBaseAPI api = new TessBaseAPI();
  boolean init = api.init("src/main/resources/data", "eng");
  api.setBufferedImage(image);
  api.getResultIterator();
  if (api.isResultIteratorAvailable()) {
    do {
      System.out.println(api.getWord().trim());
      String result = api.getBoundingBox();
      System.out.println(result);
    } while (api.resultIteratorNext());
  }
  api.end();
}

P.S.
This wrapper currently works in MacOS and Linux environments. It has not been tested on Windows. If anyone is willing to develop or improve the functionality of this wrapper, please let me know.

Tuesday, October 7, 2014

Continuous Integration for GitHub - Travis CI

Travis CI is a very impressive and cool CI tool that can directly fetch and automatically build your GitHub projects. With the following few steps you can easily integrate your GitHub projects with Travis CI.

1. Go to https://travis-ci.org/ and log in using your GitHub account

2. Click the + button and add your project to Travis CI

3. Add a .travis.yml file to the root folder of the project and push it to GitHub
This file tells Travis CI about your project, such as its language and build instructions
If your project is a Java Maven project, you can simply add

language: java

install: mvn install -Dmaven.compiler.target=1.6 -Dmaven.compiler.source=1.6 -DskipTests=true

script: mvn test -Dmaven.compiler.target=1.6 -Dmaven.compiler.source=1.6

For more configuration details, refer to the Travis CI documentation

4. Make a change to your project and push it to GitHub. The commit will show up in the Travis console at the same time, and Travis will automatically start building the project and send the build details to your email.