Thursday 29 March 2018

SPO600 - Phase 2 - Compiling Optimizations And Benchmarking

This post is going to be pretty long, as I will be going through the following stages:
  • Setting up md5deep on x86_64 and running multiple tests of md5deep
  • Benchmarking md5deep on x86_64 without optimization and with optimization
  • Benchmarking md5deep on AArch64 with optimization, as I already benchmarked md5deep on AArch64 without optimization in my SPO600 Stage 1 blog
Setting up md5deep on x86_64 is the same as setting it up on AArch64, but I will go through it quickly:
  • Download the md5deep package (tar.gz) from https://github.com/jessek/hashdeep/releases/tag/release-4.4
  • Transfer that tar.gz file to the x86_64 machine using scp or sftp
  • Unpack the package with the command tar -xvzf nameofthefile.tar.gz
  • Enter the unpacked folder and run the following commands:
  • sh bootstrap.sh
  • ./configure
  • make install
  • Generate multiple files with random data in them for testing purposes using the command below. Run it as shown first (count=1000 gives the 1.0 GB file), then change the file name and count to 10 and 100 to create the 10 MB and 105 MB files as well.    dd if=/dev/urandom of=hello2.txt bs=1048576 count=1000
  • time for i in {1..100}; do ./md5deep hello.txt; done - this command runs md5deep on the file 100 times and reports the total execution time; divide the result by 100 to get the average time of a single run (see the sketch below this list).
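
As a minimal sketch, the whole test setup plus the timing loop looks roughly like this (hello.txt, hello2.txt and hello3.txt are just the names I use here; adjust them to whatever you created):

dd if=/dev/urandom of=hello.txt bs=1048576 count=10      # ~10 MB test file
dd if=/dev/urandom of=hello2.txt bs=1048576 count=100    # ~105 MB test file
dd if=/dev/urandom of=hello3.txt bs=1048576 count=1000   # ~1.0 GB test file

time for i in {1..100}; do ./md5deep hello.txt > /dev/null; done
# redirecting to /dev/null just keeps the terminal clean; divide the reported
# real/user/sys values by 100 to get the per-run average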
(After taking average)

(10MB)
real    0m0.033s
user    0m0.028s
sys     0m0.007s

(105MB)
real    0m0.309s
user    0m0.284s
sys     0m0.037s

(1.0GB)
real    0m3.727s
user    0m3.556s
sys     0m0.458s

Before taking average

(10MB)
real    0m3.303s
user    0m2.875s
sys     0m0.742s

(105MB)
real    0m30.971s
user    0m28.934s
sys     0m3.921s

(1.0GB)
real    5m13.540s
user    4m55.446s
sys     0m46.945s

The tests above are without optimization, running at the default -O0. The following command changes all of the optimization flags at once:

find -name Makefile | xargs sed -i 's/-O0/-O2/'

Just to make sure the change has been applied, check any one of the Makefiles and you should see -O2 wherever -O0 used to be.
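
For example, a quick way to check without opening an editor (src/Makefile is just one of the Makefiles; any of them will do):

grep -n -e '-O2' src/Makefile    # should now show -O2 on the lines that used to carry -O0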

-O2 flag
10 MB file        105 MB file    1 GB file
real: 0m0.041s    real: 0m0.387s    real: 0m4.48s
user: 0m0.011s    user: 0m0.250s    user: 0m3.73s
sys: 0m0.005s    sys: 0m0.015s    sys: 0m0.84s

-O3 flag
10 MB file    105 MB file    1 GB file
real: 0m0.035s    real: 0m0.158s    real: 0m2.071s
user: 0m0.010s    user: 0m0.97s    user: 0m1.756s
sys: 0m0.003s    sys: 0m0.014s    sys: 0m0.77s

Comparing the real times, you can see that the -O3 flag makes the program faster, which is a pretty nice optimization, so I would recommend replacing the -O2 flag with -O3.
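
Following the same sed approach as above, a sketch of how I switch the build from -O2 to -O3 and rebuild before re-running the timing loop (hello.txt is again just one of my test files):

find -name Makefile | xargs sed -i 's/-O2/-O3/'
make clean
make
time for i in {1..100}; do ./md5deep hello.txt > /dev/null; done    # repeat for each file size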

Now I will do the same benchmarking with compiler optimizations on AArch64.

-O2 flag
10.5 MB file    105 MB file    1.5 GB file
real: 0m0.032s    real: 0m0.355s    real: 0m4.892s
user: 0m0.025s    user: 0m0.333s    user: 0m4.651s
sys: 0m0.001s    sys: 0m0.028s    sys: 0m0.500s

-O3 flag
10.5 MB file    105 MB file    1.5 GB file
real: 0m0.035s    real: 0m0.333s    real: 0m4.668s
user: 0m0.025s    user: 0m0.325s    user: 0m4.399s
sys: 0m0.001s    sys: 0m0.027s    sys: 0m0.326s

The optimization of md5deep on AArch64 with -O2 and -O3 is finally done, but it did not give the expected results. From the benchmarking and the optimizations applied on AArch64 and x86_64, we can say that x86_64 gave exactly the results we would expect, responding well to optimization flags like -O2 and -O3. I would certainly recommend using the -O3 flag on the x86_64 architecture when building and running the md5deep program. Profiling with gprof really helped in finding the functions that make calls to other functions, for example multihash_update(unsigned char const*, unsigned int) and void hash_final_sha1(void *ctx, unsigned char *sum), but it seemed pretty hard to make any changes to them. Changing the optimization flags was an alternative that worked overall, so Phase 3 of my project will be based on upstreaming my optimization and benchmarking work.

SPO 600 - Phase 2 - G Profiling

Phase 2 of the project is pretty long, so I decided to break it down evenly into multiple stages. This stage is about gprof profiling. What is profiling? Profiling is the process of determining how a program is using the resources that it is consuming. Profiling produces a clear view of the call graph -- the hierarchy of function/procedure/method calls that takes place during the execution of the program.

Resource consumption that can be analyzed during profiling includes:

    Time (clock time: total real time, user time, and the amount of time the kernel spent on behalf of the program)
    Memory
    Temporary storage
    Energy (this is a relatively new area of profiling)

 There are several profiling tools available. Open Source options include

    gprof
    perf
    oprofile
    SystemTap
These tools provide different combinations of profiling capabilities, and may provide additional functions. Here we will explore gprof in depth: how to use gprof on an AArch64 system with md5deep, and how it can help with benchmarking and optimization.

So our first step is to build the software to be profiled using the -pg (profile generation) option, which enables profiling when our md5deep command is compiled and run. To do that we first need to modify the Makefile or other build instructions; this can often be done through the CFLAGS or CCOPTS variables. We have two ways of modifying the Makefiles to enable profiling by adding the -pg option.

(1) Manually change every Makefile in the folder where the package was extracted. There can be multiple Makefiles.
find . -name "Makefile" -> Enter this command in that folder to locate all of the Makefiles. In my case there were 6 Makefiles and I had to modify each one of them, which was tedious. NOTE: depending on which version of md5deep you have installed, the Makefiles vary from version to version. These are my Makefiles:

./doc/Makefile
./man/Makefile
./src/Makefile
./tests/testfiles/Makefile
./tests/Makefile
./Makefile


At the end of each Makefile you will see something similar to this:

world:
        @echo meta-build system.
        @echo Making both Linux and Windows distributions
        make distclean
        ./configure CFLAGS="-Wall -W -g -ggdb -O0"
        make dist
        make windist
The ./configure line is the one to change; change it to:

./configure CFLAGS="-Wall -W -g -pg -ggdb -O2"
I added the -pg option and changed -O0 to -O2 for optimization purposes.

(2) The easier way is to modify all of the Makefiles in one shot by entering the following command:
./configure CFLAGS="-pg -g -O2" CXXFLAGS="-pg -g -O2"
After this command, I unluckily bumped into an error saying configure: error: cannot guess build type; you must specify one, and it failed and ended unexpectedly. Luckily the error wasn't that hard to solve. I did some research on it and found that the error is related to ./configure needing an update: it is no longer able to recognize aarch64 once I try to change the Makefiles. It is kind of strange, because the first time I ran ./configure for the installation it didn't give me an error. I think that plain ./configure run must have recognized the aarch64 build type, and adding the extra parameters must have broken that detection somehow, at least as far as I can tell.

NOTE: THE EXPLANATION ABOUT THE ./configure ERROR ABOVE IS JUST MY ASSUMPTION; THERE COULD BE SOME OTHER REASON FOR THE ERROR TOO. I RESEARCHED IT AND FOUND SOME RELEVANT REASONING ABOUT THE SAME SITUATION.

The error message itself included a link where some config files can be downloaded:

ftp://ftp.gnu.org/pub/gnu/config/           

While you are on that page, open the README file; it contains the following two links. Open both and copy their contents one by one:
 http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess;hb=HEAD
 http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.sub;hb=HEAD

Make two new files named config.guess and config.sub with that content (as you can see, the file names are given at the end of each URL). SCP or SFTP those files into your md5deep folder and run the same ./configure command again; it worked this time. After executing that command, run:

make clean
make
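
As an alternative sketch, if the AArch64 machine has outbound network access and wget installed, the two replacement files can be fetched directly on the server instead of being copied over by hand (these are the same two URLs from the README above):

wget -O config.guess 'http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess;hb=HEAD'
wget -O config.sub 'http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.sub;hb=HEAD'
chmod +x config.guess config.sub
./configure CFLAGS="-pg -g -O2" CXXFLAGS="-pg -g -O2"
make clean
make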

When you look at the output of ./configure and make, you can see that -pg -g -O2 is now added to the g++ invocations. Running the instrumented md5deep will also create a file called gmon.out (the raw profiling data), which gprof turns into a readable report:

gprof md5deep > 123.txt
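
Put together, a minimal sequence looks roughly like this (hello.txt and 123.txt are just the names I use; gmon.out is written into the current working directory when the instrumented program exits):

./md5deep hello.txt         # running the -pg instrumented binary writes gmon.out on exit
gprof md5deep > 123.txt     # gprof picks up gmon.out from the current directory by default
less 123.txt                # browse the call graph and flat profile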

This creates a readable file containing the profiling information. Looking at my profiling output is a bit confusing, because many functions receive multiple calls yet the time reported for them is still 0 seconds. I have shown part of it below:

granularity: each sample hit covers 4 byte(s) no time propagated

index % time    self  children    called     name
                0.00    0.00       2/505         MD5Final [4]
                0.00    0.00     503/505         MD5Update [2]
[1]      0.0    0.00    0.00     505         MD5Transform [1]
-----------------------------------------------
                0.00    0.00       4/4           hash_context_obj::multihash_update(unsigned char const*, unsigned int) [48]
[2]      0.0    0.00    0.00       4         MD5Update [2]
                0.00    0.00     503/505       


The file is pretty big and functions are called from several other functions, so I ended up focusing on one function:

multihash_update(unsigned char const*, unsigned int)

I don't think it is possible to improve it by modifying the code, since it is already written well, so my next approach will be to try different compiler flags and see whether performance is affected.

Thursday 1 March 2018

SPO600 Project Stage 1

For this project I will be working on the MD5DEEP software package, which is used in the computer security, system administration and computer forensics communities to run large numbers of files through several cryptographic digests. Basically, this software uses cryptographic hashing to fingerprint file data so that it can be verified and compared. Hashing is a method for reducing a large input to a smaller, fixed-size output.
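
As a quick illustration of that fixed-size output (using the standard md5sum tool here instead of md5deep, simply because it reads from a pipe easily):

printf 'hello' | md5sum                                        # a 5-byte input
dd if=/dev/urandom bs=1048576 count=10 2>/dev/null | md5sum    # a ~10 MB input
# both print a 32-hex-character (128-bit) MD5 digest, no matter how large the input is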

Besides MD5, the MD5DEEP package also includes tools for other digest algorithms such as SHA-1, SHA-256, Tiger and Whirlpool. It has many more features, as described below.

Recursive operation - md5deep is able to recursively examine an entire directory tree, that is, compute the MD5 for every file in a directory and for every file in every subdirectory.

Comparison mode - md5deep can accept a list of known hashes and compare them to a set of input files. The program can display either the input files that match the list of known hashes or those that do not match. Hash sets can be drawn from Encase, the National Software Reference Library, iLook Investigator, Hashkeeper, md5sum, BSD md5, and other generic hash-generating programs. Users are welcome to add support for reading other formats too!

Time estimation - md5deep can produce a time estimate when it's processing very large files.

Piecewise hashing - Hash input files in arbitrary sized blocks

File type mode - md5deep can process only files of a certain type, such as regular files, block devices, etc.

credits: http://md5deep.sourceforge.net/
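
A few example invocations of these modes, as I understand them from the md5deep documentation (I am writing these from memory, so double-check the exact flags with md5deep -h):

md5deep -r /some/directory           # recursive operation over a whole directory tree
md5deep -m known_hashes.txt *.iso    # comparison mode: show input files that match the known-hash list
md5deep -x known_hashes.txt *.iso    # negative matching: show input files that do NOT match
md5deep -e bigfile.img               # print a time estimate while hashing a very large file
md5deep -p 1m bigfile.img            # piecewise hashing in 1 MB blocks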

So my approach will be to explore MD5DEEP, understand its concepts, explain its methods, and try to optimize its code for better performance. I downloaded MD5DEEP from https://github.com/jessek/hashdeep/releases. Just for information, MD5DEEP is platform and architecture dependent, so there is a chance you might come across platform- or architecture-dependent issues; luckily I did not come across any while downloading and installing it on Linux. The repository that hosts the hashdeep source was last updated in 2014, which is pretty old, and version 4.4 turns out to be the latest release according to what I could find.


After downloading it to your local machine, go to the directory where the downloaded file is located; there are a few commands we need to run in order to install this software.

 -> Extracting the tar.gz

tar xvzf nameofthefile.tar.gz

After extraction you will see the folder with the extracted files; go into that folder and type the following commands:

->sh bootstrap.sh

->./configure

->make

->make install

After the installation is done, make a text file with some text in it in the same directory where you installed md5deep and test it with the command md5deep nameofthefile.txt or hashdeep nameofthefile.txt. In the same way you can test the SHA variants, and you can benchmark the timing simply by adding time in front of md5deep, for example:

time md5deep nameofthefile.txt // results in something like:
5444783fea966d71ed28da359a3cae9 /home/location/nameofthefile.txt
real    0m0.030s
user    0m0.000s
sys     0m0.016s

My next step will be to move this setup to AArch64 and x86_64 and benchmark it there.

Now I have set up md5deep on the AArch64 system fully, the same way I did for my local system, but there is one more step to it: copying the archive over with scp C:/path/directory/...tar.gz server@domain.com:/path/directory/
The rest of the steps are the same as before. Extract the tar.gz with tar xvzf nameofthefile.tar.gz, go into the extracted folder, and type the following commands:
->sh bootstrap.sh
->./configure
->make
->make install DESTDIR=/home/path/directory

Here we added a DESTDIR path to make install instead of installing into the default location, because I ran into issues while installing: it gave me an error saying PERMISSION DENIED. You can try installing into the default directory and you might succeed, but in my case 'make install' was complaining because it was trying to install into system directories that are only writable by the system administrator. This is actually a good thing, because it prevents you from overwriting system files with your test files.
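
Because of DESTDIR the files end up under that directory rather than on the system PATH, so the installed binary has to be run from there. A rough sketch (the exact sub-path depends on the prefix ./configure chose, /usr/local by default, so adjust as needed):

/home/path/directory/usr/local/bin/md5deep nameofthefile.txt
# or just run the freshly built binary straight out of the source tree,
# which in this release lives under src/
./src/md5deep nameofthefile.txt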

Our next step is to create 3 files with different sizes. This can be done with the command dd if=/dev/urandom of=file.txt bs=1048576 count=10, which creates a file of size count*bs filled with randomly generated content (just for information, the content will not be readable). In the case above the file will be 10 MB; in the same way, by changing count (and bs if needed), we can create the remaining 2 files. Here is the output from running the command for each size:

(10mb file)
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.125207 s, 83.7 MB/s

(105mb file)
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 13.4593 s, 7.8 MB/s

(1gb file)
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 12.4558 s, 84.2 MB/s

Checking the number of lines in the file
wc -l file.txt
41031 file.txt

Compile md5deep and run all three files 100 times each with the command time for i in {1..100}; do md5deep file.txt; done. The times below are the totals for running the program 100 times, so we have to take an average to find the time of a single run.
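
A quick way to do that division on the command line, using the totals from the results below (bc just prints total seconds divided by 100):

echo "scale=5; 4.260/100" | bc            # 10 MB file: per-run real time
echo "scale=5; (6*60+28.003)/100" | bc    # 1 GB file: convert 6m28.003s to seconds first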

BEFORE AVERAGE
(10 mb file)
real 0m4.260s
user 0m3.586s
sys 0m0.786s

(100mb file)
real 0m39.260s
user 0m34.493s
sys 0m5.738s

(1gb file)
real 6m28.003s
user 5m45.722s
sys 0m51.928s

AFTER AVERAGE
(10 mb file)
real 0m0.04260s
user 0m0.03586s
sys 0m0.00786s

(100mb file)
real 0m0.39260s
user 0m0.34493s
sys 0m0.05738s

(1gb file)
real 0m3.88003s
user 0m3.45722s
sys 0m0.51928s

So my upcoming blog post for this project will be based on some comparison of md5deep with the hashdeep algorithm and sha256. But the main purpose of Phase 2 will be to try altered build options when running md5deep, and to make some changes in the md5deep code to permit better optimization by the compiler if I can. I will make sure such optimizations and changes do not affect AArch64 systems. I found md5deep on GitHub, and luckily it has some files to look into, md5.c and md5.h, so I am still getting to grips with its coding pattern and will start working on it real soon.