Thursday, March 3, 2016

A few custom Scala Hive UDFs I wrote to simplify some of my SQL queries on Hive.
Git repo link: https://github.com/lserinol/Hive-udfs


Hive-udfs

Collection of my Scala Hive UDFs

Requirements

  • SBT
  • Scala
  • Hive

Compile

  1. Clone or fork the project: git clone https://github.com/lserinol/Hive-udfs.git
  2. Run sbt assembly
  3. Add the jar 'target/scala-<version number>/levent-hive-funcs-assembly-<version number>.jar' to HIVE_AUX_JARS_PATH,
    or use the "add jar" command in Hive-CLI/Beeline

Registering UDFs

create temporary function lcrc32 as 'com.levent.hive.udfs.lcrc32'; 
create temporary function ldays_between as 'com.levent.hive.udfs.ldays_between'; 
create temporary function lmonths_between as 'com.levent.hive.udfs.lmonths_between'; 
create temporary function ilike as 'com.levent.hive.udfs.liLIKE'; 
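
Typing the registration statements in every session gets tedious, so here is a minimal sketch that writes them to an init file instead. The function name hive_udf_init_sql and the file name udfs.hql are my own placeholders, not part of the repo, and the jar path is whatever sbt assembly produced on your machine.

```bash
#!/bin/bash
# Emit the HiveQL that loads the assembly jar and registers the four UDFs.
# hive_udf_init_sql is a hypothetical helper name; the jar path is passed in
# by the caller because it depends on the Scala and project versions.
hive_udf_init_sql() {
    local jar="$1"
    printf 'add jar %s;\n' "$jar"
    # printf repeats the format for each (function name, class name) pair.
    printf "create temporary function %s as 'com.levent.hive.udfs.%s';\n" \
        lcrc32 lcrc32 \
        ldays_between ldays_between \
        lmonths_between lmonths_between \
        ilike liLIKE
}

# Example (not run here):
#   hive_udf_init_sql /path/to/assembly.jar > udfs.hql
#   hive -i udfs.hql
```

The -i option of the Hive CLI runs the given file before the session starts, so the functions should be available right away.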

com.levent.hive.udfs.lcrc32

  • takes a string input and returns its CRC32 value (long)

com.levent.hive.udfs.ldays_between

  • calculates the difference in days between two dates and returns the number of days (int)

com.levent.hive.udfs.lmonths_between

  • calculates the difference in months between two dates and returns the number of months (int)

com.levent.hive.udfs.liLIKE

  • case-insensitive LIKE for Hive: compares two strings and returns a boolean (TRUE/FALSE)

Monday, May 6, 2013

Which processes are using your swap? (Linux)

Here is a simple bash script that lists the processes using your Linux system's swap area.
#!/bin/bash
# List every process that has pages in swap, with a grand total in kB.

exec 2>/dev/null   # hide errors from processes that exit while /proc is read
total=0
printf "pid\t\tprocess\t\tmemory (kB)\n"
while read -r me
do
 ps=$(cat "$me/comm")
 res1=$?
 # The line looks like "VmSwap:  123 kB", so val[1] is the number
 val=($(grep "^VmSwap" "$me/status"))
 res2=$?
 p=$(basename "$me")
 if [[ $res1 -eq 0 && $res2 -eq 0 ]]; then
  if [ "${val[1]}" -ne 0 ]; then
   printf "%d\t\t%s\t\t%s\n" "$p" "$ps" "${val[1]}"
   total=$((total + val[1]))
  fi
 fi
done < <(find /proc/ -mindepth 1 -maxdepth 1 -type d -name "[0-9]*" )
echo "Total(kB):$total"


Sample output (processes not sorted by their memory usage)
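
If you would rather see the biggest swap users first, here is a rough variant that sorts by swap usage. It is only a sketch: the function name swap_by_process is my own, the optional argument exists just so it can be pointed at a test directory instead of /proc, and the output puts the swap size in the first column so that sort can work on it.

```bash
#!/bin/bash
# Print "swap_kB<TAB>pid<TAB>name" for each process with non-zero VmSwap,
# largest first. swap_by_process is a hypothetical name; the proc root
# defaults to /proc and is a parameter only to make testing possible.
swap_by_process() {
    local root="${1:-/proc}" status
    for status in "$root"/[0-9]*/status; do
        [ -r "$status" ] || continue
        awk -v pid="$(basename "$(dirname "$status")")" '
            /^Name:/   { sub(/^Name:[ \t]*/, ""); name = $0 }
            /^VmSwap:/ { swap = $2 }
            END        { if (swap > 0) printf "%d\t%s\t%s\n", swap, pid, name }
        ' "$status" 2>/dev/null
    done | sort -k1,1nr
}

swap_by_process
```

Kernel threads have no VmSwap line in their status file, so they drop out on their own.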

Thursday, September 20, 2012

Ext4 sysfs parameters

Have you ever wondered how much data has been written to your Ext4 filesystem since it was first created, or since it was last mounted? Here are two ext4 sysfs parameters that will tell you the magic numbers you need (both are in kilobytes; <device> is the block device name, such as sda1).
/sys/fs/ext4/<device>/lifetime_write_kbytes
/sys/fs/ext4/<device>/session_write_kbytes
Here are some more ext4 sysfs parameters and their meanings, mostly for tuning the ext4 multiblock allocator: ext4 sysfs parameters
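
To read both counters for every mounted ext4 filesystem at once, something like the following sketch should do. The function name ext4_write_stats is made up for this example, and the optional argument is there only so it can be tried against a fake directory; by default it reads /sys/fs/ext4.

```bash
#!/bin/bash
# Print lifetime and session write totals for each ext4 filesystem in sysfs.
# ext4_write_stats is a hypothetical name; the sysfs root is a parameter
# only for testing and defaults to /sys/fs/ext4.
ext4_write_stats() {
    local root="${1:-/sys/fs/ext4}" dir dev lifetime session
    for dir in "$root"/*/; do
        # Skip entries such as "features" that do not carry the counters.
        [ -r "${dir}lifetime_write_kbytes" ] || continue
        dev=$(basename "$dir")
        lifetime=$(cat "${dir}lifetime_write_kbytes")
        session=$(cat "${dir}session_write_kbytes")
        # kB / 1024 / 1024 gives whole GiB (integer division, rounded down).
        printf '%s\tlifetime=%s kB (~%s GiB)\tsession=%s kB\n' \
            "$dev" "$lifetime" "$((lifetime / 1024 / 1024))" "$session"
    done
}

ext4_write_stats
```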

Tuesday, March 13, 2012

FreeBSD ciss driver logical drive limit

Lately, I faced a strange problem on an HP server running FreeBSD with an HP Smart Array P800 controller. All 24 of my hard drives, hosted in external HP enclosures, were presented as 24 single-drive RAID 0 logical volumes. Surprisingly, when FreeBSD booted it did not present the disks, and it disabled the ciss driver (HP Smart Array controller) with the following message:

Mar 12 16:00:41 freebsd kernel: ciss1: adapter claims to report absurd number of logical drives (24 > 15)

A quick check of the driver source code showed that there is a hard limit of 15 defined in /usr/src/sys/dev/ciss/cissvar.h:

 #define CISS_MAX_LOGICAL        15


I wanted to be sure that raising the number would not cause any unwanted behavior on the system later. At that point I contacted Paul Saab, who explained the limit as follows:
 really that's done to limit the amount of memory needed upfront by the
driver.  I believe you can easily increase the number of drives
without issue as long as you have enough memory below 4GB.  Parts of
the ciss driver require that the memory you DMA from be under 4G


After the clarification, I set the number to 32, rebuilt and installed the kernel, and restarted the machine. A quick check of dmesg showed all 24 of my logical volumes presented on the HP P800 controller.
Mar 13 10:32:37 freebsd kernel: da19: 286070MB (585871964 512 byte sectors: 255H 32S/T 65535C)
Mar 13 10:32:37 freebsd kernel: da20 at ciss1 bus 0 scbus2 target 19 lun 0 
Mar 13 10:32:37 freebsd kernel: da20:  Fixed Direct Access SCSI-5 device
Mar 13 10:32:37 freebsd kernel: da20: 135.168MB/s transfers
Mar 13 10:32:37 freebsd kernel: da20: Command Queueing enabled
Mar 13 10:32:37 freebsd kernel: da20: 286070MB (585871964 512 byte sectors: 255H 32S/T 65535C)
Mar 13 10:32:37 freebsd kernel: da21 at ciss1 bus 0 scbus2 target 20 lun 0 
Mar 13 10:32:37 freebsd kernel: da21:  Fixed Direct Access SCSI-5 device
Mar 13 10:32:37 freebsd kernel: da21: 135.168MB/s transfers
Mar 13 10:32:37 freebsd kernel: da21: Command Queueing enabled
Mar 13 10:32:37 freebsd kernel: da21: 286070MB (585871964 512 byte sectors: 255H 32S/T 65535C)
Mar 13 10:32:37 freebsd kernel: da22 at ciss1 bus 0 scbus2 target 21 lun 0 
Mar 13 10:32:37 freebsd kernel: da22:  Fixed Direct Access SCSI-5 device
Mar 13 10:32:37 freebsd kernel: da22: 135.168MB/s transfers
Mar 13 10:32:37 freebsd kernel: da22: Command Queueing enabled
Mar 13 10:32:37 freebsd kernel: da22: 286070MB (585871964 512 byte sectors: 255H 32S/T 65535C)
Mar 13 10:32:37 freebsd kernel: da23 at ciss1 bus 0 scbus2 target 22 lun 0 
Mar 13 10:32:37 freebsd kernel: da23:  Fixed Direct Access SCSI-5 device
Mar 13 10:32:37 freebsd kernel: da23: 135.168MB/s transfers
Mar 13 10:32:37 freebsd kernel: da23: Command Queueing enabled
Mar 13 10:32:37 freebsd kernel: da23: 286070MB (585871964 512 byte sectors: 255H 32S/T 65535C)
Mar 13 10:32:37 freebsd kernel: da24 at ciss1 bus 0 scbus2 target 23 lun 0 
Mar 13 10:32:37 freebsd kernel: da24:  Fixed Direct Access SCSI-5 device
Mar 13 10:32:37 freebsd kernel: da24: 135.168MB/s transfers
Mar 13 10:32:37 freebsd kernel: da24: Command Queueing enabled
Mar 13 10:32:37 freebsd kernel: da24: 286070MB (585871964 512 byte sectors: 255H 32S/T 65535C)
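
Rather than counting the attach lines in dmesg by eye, a small filter can do it. This is only a sketch: count_ciss_disks is a name I made up, and it assumes the attach lines look like the "daN at ciss1 bus ..." lines in the log above.

```bash
#!/bin/sh
# Count "daN at <controller>" attach lines read from standard input.
# count_ciss_disks is a hypothetical helper name; the controller name
# defaults to ciss1, the one shown in the log output above.
count_ciss_disks() {
    awk -v ctrl="${1:-ciss1}" '
        $0 ~ ("da[0-9]+ at " ctrl " ") { n++ }
        END { print n + 0 }
    '
}
```

Usage: dmesg | count_ciss_disks ciss1
With the raised limit in place, the count should match the number of logical volumes defined on the controller.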