Showing posts with label python. Show all posts

Thursday, June 6, 2024

Python hands-on self learning

Classroom training may not be sufficient to learn Python. You have to get your hands dirty. Here are some assignments that you could do, to learn Python the self-help way.

Exercise 1. In a Python script, accept command line arguments. Display all the arguments and also the number of arguments.


Expected output:

$ ex_01.py here there
arguments:
here
there
Number of arguments: 2
$
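
If you get stuck on this first exercise, here is one possible starting point. It is only a sketch: the function name describe_arguments is my own invention for illustration, and it assumes Python 3.

```python
import sys


def describe_arguments(argv):
    """Return the lines to print for the arguments that follow the script name."""
    args = argv[1:]  # argv[0] is the name of the script itself
    lines = ["arguments:"]
    lines.extend(args)
    lines.append("Number of arguments: %d" % len(args))
    return lines


if __name__ == "__main__":
    print("\n".join(describe_arguments(sys.argv)))
```

Keeping the logic in a function, instead of printing directly, makes it easy to try out from the Python prompt with a hand-made list.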

Exercise 2. Accept a filename as command line argument. Display the contents of that file.

Expected output:

$ cat > ringfile.txt
Three Rings for the Elven-kings under the sky,
Seven for the Dwarf-lords in their halls of stone,
Nine for Mortal Men doomed to die,
One for the Dark Lord on his dark throne
In the Land of Mordor where the Shadows lie.
One Ring to rule them all, One Ring to find them,
One Ring to bring them all and in the darkness bind them
In the Land of Mordor where the Shadows lie.
(Ctrl+D)$
$
$ ex_02.py ringfile.txt
Three Rings for the Elven-kings under the sky,
Seven for the Dwarf-lords in their halls of stone,
Nine for Mortal Men doomed to die,
One for the Dark Lord on his dark throne
In the Land of Mordor where the Shadows lie.
One Ring to rule them all, One Ring to find them,
One Ring to bring them all and in the darkness bind them
In the Land of Mordor where the Shadows lie.

$

This is similar to displaying the contents of a file using cat command.


Exercise 3. Accept a filename as command line argument. Display the contents of that file in the opposite order that they appear in the file.

Expected output:

$ ex_03.py ringfile.txt
In the Land of Mordor where the Shadows lie.
One Ring to bring them all and in the darkness bind them
One Ring to rule them all, One Ring to find them,
In the Land of Mordor where the Shadows lie.
One for the Dark Lord on his dark throne
Nine for Mortal Men doomed to die,
Seven for the Dwarf-lords in their halls of stone,
Three Rings for the Elven-kings under the sky,

$


Exercise 4. Accept a string as first argument and a filename as second argument. Display all lines of that file which contain the given string.

Expected output:

$ ex_04.py them ringfile.txt
One Ring to rule them all, One Ring to find them,
One Ring to bring them all and in the darkness bind them

$


Exercise 5. Accept a username as command line argument. Display if that user is currently logged in or not. Run who or w command and use its output to determine if the given user is logged in or not.

Expected output:

$ ex_05.py yogesh
yogesh is not logged in
$
$ ex_05.py root
root is logged in
$
$ ex_05.py roo
roo is not logged in

$


Exercise 6. Accept command line arguments. Consider all command line arguments as user names and for all the given user names, display if they are logged in or not.

Expected output:

$ ex_06.py yogesh pts roo root
yogesh is not logged in
pts is not logged in
roo is not logged in
root is logged in

$


Exercise 7. Display today's date in the format: 22-Aug-2008

Exercise 8. Display yesterday's date in the format: Thu Aug 21 16:58:10 2008

Exercise 8.1. Display tomorrow's date in the format: 21-Aug-2008
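
A hint for the three date exercises: the datetime module can do both the date arithmetic and the formatting. The sketch below is one possible approach, not the only one. The function names format_short and format_long are made up for illustration, and the month and day names assume the default English locale.

```python
from datetime import datetime, timedelta


def format_short(moment):
    """Format a datetime like 22-Aug-2008."""
    return moment.strftime("%d-%b-%Y")


def format_long(moment):
    """Format a datetime like Thu Aug 21 16:58:10 2008."""
    return moment.strftime("%a %b %d %H:%M:%S %Y")


now = datetime.now()
one_day = timedelta(days=1)
print(format_short(now))            # today
print(format_long(now - one_day))   # yesterday
print(format_short(now + one_day))  # tomorrow
```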


Exercise 9. Declare a list as follows:

nums = [5, 10, 15, 23, 20, 24, 30, 33, 40, 13]

Sort this list numerically in ascending order.

Exercise 9.1. Sort this list numerically in descending order.


Exercise 10. Declare a list as follows:

nums = [5, 10, 20, 23, 15, 23, 20, 5, 24, 30, 33, 40, 3, 13]

Find the duplicate elements in this list.


Exercise 11. Declare a list as follows:

nums = [5, 10, 20, 23, 15, 23, 20, 5, 24, 30, 33, 40, 3, 13]

Find the unique elements in this list.
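
One way to approach Exercises 10 and 11 is to count how often each value occurs. The sketch below uses collections.Counter. Note that "unique" can be read in two ways (every distinct value, or only the values that never repeat), so it shows both. The variable names are my own.

```python
from collections import Counter

nums = [5, 10, 20, 23, 15, 23, 20, 5, 24, 30, 33, 40, 3, 13]
counts = Counter(nums)  # maps each value to the number of times it occurs

duplicates = sorted(n for n, c in counts.items() if c > 1)
distinct = sorted(counts)  # every value, listed once
seen_once = sorted(n for n, c in counts.items() if c == 1)  # values that never repeat

print("duplicates:", duplicates)
print("distinct values:", distinct)
print("values seen only once:", seen_once)
```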


Exercise 12. Accept a string as command line argument. Declare a dictionary as follows:

food={'apple':'red', 'banana':'yellow', 'tomato':'red', 'spinach':'green', 'lemon':'yellow',}

Check whether the given string is present as a key in this dictionary.

Expected output:

$ ex_12.py apple
We've got apple
$
$ ex_12.py orange
We've got no orange

$


Exercise 13. Declare a dictionary as follows:

d = {'d':12, 'b':20, 'g':2, 'a':6, 'e':1, 'h':13, 'c':8, 'f':5}

Display keys and values from this dictionary sorted by keys.

Expected output:

$ ex_13.py
key = a value = 6
key = b value = 20
key = c value = 8
key = d value = 12
key = e value = 1
key = f value = 5
key = g value = 2
key = h value = 13

$


Exercise 14. Declare a dictionary as follows:

d = {'d':12, 'b':20, 'g':2, 'a':6, 'e':1, 'h':13, 'c':8, 'f':5}

Display keys and values from this dictionary numerically sorted by values.

Expected output:

$ ex_14.py
key = e value = 1
key = g value = 2
key = f value = 5
key = a value = 6
key = c value = 8
key = d value = 12
key = h value = 13
key = b value = 20

$
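
A hint for Exercises 13 and 14: sorted() accepts a key function, so the same call can order the items by key or by value. Here is a minimal sketch for the sort by values. The helper name by_value is hypothetical.

```python
d = {'d': 12, 'b': 20, 'g': 2, 'a': 6, 'e': 1, 'h': 13, 'c': 8, 'f': 5}


def by_value(mapping):
    """Return (key, value) pairs ordered by value, smallest first."""
    return sorted(mapping.items(), key=lambda item: item[1])


for key, value in by_value(d):
    print("key =", key, "value =", value)
```

For Exercise 13, sorted(d.items()) is enough, because tuples compare by their first element (the key) first.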


Exercise 15. Declare two dictionaries as follows:

food={'apple':'red', 'rice':'white', 'banana':'yellow', 'tomato':'red', 'spinach':'green'}

fruits={'plum':'red', 'banana':'yellow', 'blueberry':'blue', 'mulberry':'black', 'apple':'red', 'pear':'green'}

Find common keys in these dictionaries.

Expected output:

$ ex_15.py
apple
banana

$


Exercise 16. Declare two dictionaries as follows:

food={'apple':'red', 'rice':'white', 'banana':'yellow', 'tomato':'red', 'spinach':'green'}

fruits={'plum':'red', 'banana':'yellow', 'blueberry':'blue', 'mulberry':'black', 'apple':'red', 'pear':'green'}

Find keys that are present in dictionary food and not present in dictionary fruits.

Expected output:

$ ex_16.py
rice
tomato
spinach

$
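
A hint for Exercises 15 and 16: in Python 3, the keys() view of a dictionary supports set operations such as & (intersection). The sketch below is one possible approach. The variable names common and only_in_food are my own.

```python
food = {'apple': 'red', 'rice': 'white', 'banana': 'yellow',
        'tomato': 'red', 'spinach': 'green'}
fruits = {'plum': 'red', 'banana': 'yellow', 'blueberry': 'blue',
          'mulberry': 'black', 'apple': 'red', 'pear': 'green'}

# Keys present in both dictionaries; sorted because sets have no order.
common = sorted(food.keys() & fruits.keys())

# Keys present only in food; a plain loop keeps the order used in food.
only_in_food = [key for key in food if key not in fruits]

print(common)
print(only_in_food)
```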


Exercise 17. Write a program to read file /etc/passwd and prepare a dictionary as follows:

(a) keys of the dictionary would be the user names

(b) corresponding value would be the home directory of that user

Also, display contents of this dictionary.


Exercise 18. Write a program to read file /etc/passwd and prepare a dictionary as follows:

(a) keys of the dictionary would be the user names

(b) corresponding value would be another dictionary as follows:

i. key = uid value = user id of that user

ii. key = homedir value = home directory of that user

iii. key = shell value = default shell of that user

Also, display contents of this dictionary.


Exercise 19. Write a program to find and print the longest word in a text file. If several words share the longest length, print all of them.

Expected output:

$ python find_longest_word.py ringfile.txt
Elven-kings
Dwarf-lords

$


Exercise 20. Write a function that simulates the roll of a die. That is, it generates a random integer between 1 and 6 (both inclusive).


Exercise 21. Accept a filename as command line argument. Display the word that occurs most in it. Also display the number of occurrences of that word.

Expected output:

$ python ex_21.py ringfile.txt
highest occurrence: the (9)

$
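
If you want a nudge for Exercise 21, the counting can be done with collections.Counter. The sketch below works on a string, so reading the file is still left to you. The function name most_common_word and the rule for what counts as a word (letters, optionally joined by a hyphen or apostrophe, compared without regard to case) are my assumptions.

```python
import re
from collections import Counter

# A word is a run of letters, optionally joined by hyphens or apostrophes.
WORD = re.compile(r"[A-Za-z]+(?:[-'][A-Za-z]+)*")


def most_common_word(text):
    """Return (word, count) for the most frequent word, or None if there are no words."""
    words = WORD.findall(text.lower())
    if not words:
        return None
    return Counter(words).most_common(1)[0]


word, count = most_common_word("the cat and the hat")
print("highest occurrence: %s (%d)" % (word, count))
```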


Exercise 22. Accept a directory path as command line argument. In the given directory, find the file having oldest modification time. Display the file name along with how many days ago it was modified.

Expected output:

$ ex_22.py /foo/
/foo/ does not exist
$
$ ex_22.py /etc/passwd
/etc/passwd is not a directory
$
$ ex_22.py /etc/
motd was modified 167 days ago

$


Exercise 23. Write a python program to validate PAN codes.

Get the PAN codes to be validated on the command line. More than one PAN code could be provided.

$ ./pan_code_validator.py BETPK1234M

$ ./pan_code_validator.py CRLCT3456G H2YPJ5678L

Q. What is a PAN?

A. Please see https://en.wikipedia.org/wiki/Permanent_account_number

No bonus points if you could write a solution within five minutes. A comprehensive solution is better than a fast and clumsy one.
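
As a starting point, the overall shape of a PAN (five uppercase letters, four digits, one uppercase letter) can be checked with a regular expression. This is a deliberately incomplete sketch: it does not check that the fourth letter is one of the permitted holder-type codes, which a comprehensive solution should also do. The name is_valid_pan is hypothetical.

```python
import re

# Shape only: 5 uppercase letters, 4 digits, 1 uppercase letter.
PAN_PATTERN = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")


def is_valid_pan(code):
    """Return True if the code has the basic shape of a PAN."""
    return PAN_PATTERN.fullmatch(code) is not None


for code in ["BETPK1234M", "CRLCT3456G", "H2YPJ5678L"]:
    print(code, "looks valid" if is_valid_pan(code) else "is not valid")
```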


Exercise 24. Accept a filename as command line argument. Assuming that the file passed is a .json file, read its contents and display them.

Before reading the file, make sure that:

(a) it exists

(b) it is readable

(c) it is a simple file (not a directory or a device file)

(d) if the file is not a well formatted .json, display an appropriate error and exit.
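
The checks listed above could be sketched as follows. This is one possible arrangement, not a model answer: the function name read_json_file and the idea of returning an error message instead of exiting are my own choices, so that the function is easy to try out.

```python
import json
import os


def read_json_file(path):
    """Return (data, None) on success, or (None, error_message) on failure."""
    if not os.path.exists(path):
        return None, "%s does not exist" % path
    if not os.path.isfile(path):  # False for directories and device files
        return None, "%s is not a regular file" % path
    if not os.access(path, os.R_OK):
        return None, "%s is not readable" % path
    try:
        with open(path) as handle:
            return json.load(handle), None
    except json.JSONDecodeError as error:
        return None, "%s is not well-formed JSON: %s" % (path, error)


data, problem = read_json_file("no_such_file.json")
print(problem if problem else data)
```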


Exercise 25. Write a python program to connect to a remote host using the SSH protocol, run ls command, and show the output.


Exercise 26. Write a python program to create and delete directories repeatedly, starting from a given directory path. Obviously, the program should create directories before they are deleted. This program would be useful for checking certain features of a filesystem, and also of NFS and CIFS shared drives.

Arguments to this program should be as listed below.

1. directory path - starting point for creating / deleting directories

2. (optional) duration - how long to run, in seconds. Default : 300 seconds (5 minutes)

3. (optional) dontstop - run continuously till you stop the script by sending a signal using ctrl + c. This option should override duration.

4. (optional) dirprefix - string to be used while naming directories

Sub-directories in each directory should be named as dir1, dir2, dir3 and so on. If an optional argument dirprefix is mentioned, the string supplied along with it should be used while naming sub-directories.

Obviously, this program should not delete a directory that does not exist. Also, a thread / process should not create a directory which is already created / being created by another thread / process.


Exercise 27. Write a python program to convert English text to Morse code and vice versa. Or better yet, write two separate python programs, one to convert English text to its equivalent Morse code, and another to convert Morse code to its equivalent English text.

Get names of two files on command line, one for input and one for output, as shown below.

$ ./english_to_morse.py --infile notes.txt --outfile notes_in_morse_code.txt

$ ./morse_to_english.py --infile notes_in_morse_code.txt --outfile notes.txt

Read the English text / Morse code from the input file and write the converted Morse code / English text to the output file.

Q. What is Morse code?

A. See http://en.wikipedia.org/wiki/Morse_Code

Q. Which ASCII characters are considered valid for converting to Morse code?

A. Let's refer to http://en.wikipedia.org/wiki/Morse_Code#Letters.2C_numbers.2C_punctuation

Q. Do you know a web site that does this conversion?

A. There are many. Here's one - http://www.onlineconversion.com/morse_code.htm

Q. What would be the criteria to evaluate my solution?

A. No bonus points if you could write a solution within 10 minutes. A comprehensive solution is better than a fast and clumsy one.

You'll surely get bonus points if you could use some python package rather than writing all of the code yourself from scratch.


Friday, July 17, 2020

Python code to generate CAPTCHA

Today I wrote python code to generate CAPTCHA images.  Before we jump into the code, let's recap what a CAPTCHA is.  CAPTCHA is an acronym for Completely Automated Public Turing test to tell Computers and Humans Apart.  It is a type of challenge–response test used in computing to determine whether the user is a human.  It is used wherever we want only humans to proceed and want to stop everything else (bots et al.).  How effective CAPTCHAs are, and whether they can be circumvented, is not a topic I cover here.  Right now, let us get to writing python code for generating CAPTCHA images.

Well, a python library named captcha is available that you could use to generate audio and image CAPTCHAs.  You could install this library and use it in your python code.  If you want to generate image CAPTCHAs from python code yourself, without using the captcha library, do read on.  Wait.  Why would someone write their own code when a ready-made library is available?  This is an obvious question.  Well, in most cases we should use the existing library.  Only in some exceptional cases should we write our own code: when the library is not available or is not allowed to be installed, or when we want some customization in the implementation.  What we have here could be one such case, where we want to customize how the image CAPTCHAs are generated.

Here is the python code I wrote to generate image CAPTCHAs.

from PIL import Image
from PIL import ImageDraw
from PIL import ImageFont
import string
import random


def draw_text(img, x_given, y_given, text_given, font_given, fill_given):
    d = ImageDraw.Draw(img)
    d.text((x_given, y_given), text_given, font=font_given, fill=fill_given)


def char_selecting(size=6):
    characters = "abdefghijklmnqrtuy123456789ABDEFGHIJKLMNQRTUY"
    # Exclude letters that look similar in capital and small cases
    # Why?
    # We are randomly choosing a font size.  Bigger to smaller.
    # So can't tell whether you are looking at C or c in that situation
    # Exclude zero because we want to avoid confusion between 0 and O and o
    selection = ""
    for x in range(0, size):
        char = characters[random.randint(0, len(characters) - 1)]
        selection = selection + char
    return selection


image_x_size = 200
image_y_size = 100
img = Image.new('RGB', (image_x_size, image_y_size), color = (250, 250, 250))

word = char_selecting(6)    # We want 6 characters in our CAPTCHA
print(word)

x_pos = 10    # Position of the first character in the image

for char in word:
    print(char)

    font_size = random.randint(16, 50)
    # Font size smaller than 16 is too small
    # Font size bigger than 50 is too big
    font_selected = ImageFont.truetype('/usr/share/fonts/gnu-free/FreeMono.ttf', font_size)

    fill_selected = (random.randint(0, 200), random.randint(0, 200), random.randint(0, 200), random.randint(0, 255))
    # Chose a random colour

    y_random = random.randint(0, 30)
    # All characters should not be in a horizontal line.
    # So shift position of each character randomly

    draw_text(img, x_pos, y_random, char, font_selected, fill_selected)

    x_pos = x_pos + 30
    # Position of the next character
    # We are randomly choosing a font size for each character
    # So position of the next character should not be chosen randomly

mid_x = image_x_size // 2    # Integer division, because random.randint needs integers
mid_y = image_y_size // 2

first_half_x = random.randint(1, mid_x - 1)
first_half_y = random.randint(1, mid_y - 1)

second_half_x = mid_x + random.randint(1, mid_x - 1)
second_half_y = mid_y + random.randint(1, mid_y - 1)

print("Going to draw a line from", first_half_x, ",", first_half_y, "to", second_half_x, ",", second_half_y)
d = ImageDraw.Draw(img)
# The second argument of line() is fill (the colour), not the width
d.line((first_half_x, first_half_y, second_half_x, second_half_y), fill=10)

img.save('captcha.png')   # Save the image as a file on the disk


Here are some of the CAPTCHA images that I generated from this code.








Now let us see this code in detail.

from PIL import Image
We are using the Image module from the PIL library.

from PIL import ImageDraw
We are using the ImageDraw module from the PIL library.

from PIL import ImageFont
We are using the ImageFont module from the PIL library.


def draw_text(img, x_given, y_given, text_given, font_given, fill_given):
    d = ImageDraw.Draw(img)
    d.text((x_given, y_given), text_given, font=font_given, fill=fill_given)

I have written this function to draw the given text at the specified place in the given image.  The font and color to be used must be specified.  So why did I write a separate function for this?  Why not simply do this in the main code?  Well, you could do that.  Here I wrote a generic function, and used it.  So that tomorrow if I have to extend this code, a generic function would be useful.


def char_selecting(size=6):
    characters = "abdefghijklmnqrtuy123456789ABDEFGHIJKLMNQRTUY"

    # Exclude letters that look similar in capital and small cases
    # Why?
    # We are randomly choosing a font size.  Bigger to smaller.
    # So can't tell whether you are looking at C or c in that situation
    # Exclude zero because we want to avoid confusion between 0 and O and o
    selection = ""
    for x in range(0, size):
        char = characters[random.randint(0, len(characters) - 1)]
        selection = selection + char
    return selection

This is the function where the magic happens: choosing the set of characters that we want to show in the CAPTCHA image.  The caller may choose the length of the string by passing a numeric value; otherwise a default of 6 is used.  We randomly choose the characters and form a string from them.  While choosing the characters, I have not considered all of the uppercase letters, lowercase letters, and digits.  Why?  Allow me to explain.  Looking at the CAPTCHA images that I have generated, you must have noticed that the font size of each character varies.  I did that deliberately, to make it harder for OCR software to recognize the text.  But when we have a mix of small and big font sizes, how could you tell the difference between c and C, or between z and Z?  The same goes for P, S, V, W, and X.  O is even more difficult, because a 0 (zero) looks similar too.  So I chose not to use these characters, so that we don't create confusion for humans.
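
As an aside, the same selection can be written more compactly in Python 3.6 or later with random.choices, which picks k characters with replacement.  This is only an alternative sketch, not the code I used for the images.  The constant name CHARACTERS and the extra characters parameter are additions for illustration.

```python
import random

# Same character set as above: no c, o, p, s, v, w, x, z in either case, and no zero
CHARACTERS = "abdefghijklmnqrtuy123456789ABDEFGHIJKLMNQRTUY"


def char_selecting(size=6, characters=CHARACTERS):
    """Return a string of `size` characters picked at random, with replacement."""
    return "".join(random.choices(characters, k=size))


print(char_selecting())      # six characters by default
print(char_selecting(8))     # or any other length
```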

image_x_size = 200
image_y_size = 100
img = Image.new('RGB', (image_x_size, image_y_size), color = (250, 250, 250))
This is the start of the main code.  I have chosen image size of 200 x 100 pixels.  I have chosen background color which is almost white, but not exactly white.

word = char_selecting(6)    # We want 6 characters in our CAPTCHA
print(word)
A call to function char_selecting gets us a string of length 6, made up of randomly selected characters.

x_pos = 10    # Position of the first character in the image
In the loop that follows, you will observe that I keep on increasing this value, for the subsequent characters in the image.

for char in word:
    print(char)
From the string that we have prepared, here we take one character at a time, and work on it.

    font_size = random.randint(16, 50)
    # Font size smaller than 16 is too small
    # Font size bigger than 50 is too big
    font_selected = ImageFont.truetype('/usr/share/fonts/gnu-free/FreeMono.ttf', font_size)
As mentioned earlier, I have deliberately chosen a random font size for each character.  So that the resulting text could not be recognized by an OCR software.  Here I am using the same font for all characters.  If you want, you could obtain a list of fonts available in your system, and randomly choose a different font for each character.  That implementation I leave up to you.  Some weird fonts may be present in the system, which would render some of the characters beyond recognition by humans.  Considering this possibility, I did not do that.  I found a font in which the characters are rendered in a way that is easy for recognition by humans.  And I am sticking to that font.

    fill_selected = (random.randint(0, 200), random.randint(0, 200), random.randint(0, 200), random.randint(0, 255))
    # Chose a random colour
We are deliberately choosing a different color for each character.  So that the resulting text could not be recognized by an OCR software.

    y_random = random.randint(0, 30)
    # All characters should not be in a horizontal line.
    # So shift position of each character randomly
In my opinion, this is an important trick.  If all characters appear in a horizontal line, there is a greater possibility of them being recognized by OCR software.  To reduce this possibility, we shift the vertical position of each character by a random amount between 0 and 30 pixels.

    draw_text(img, x_pos, y_random, char, font_selected, fill_selected)

    x_pos = x_pos + 30
    # Position of the next character
    # We are randomly choosing a font size for each character
    # So position of the next character should not be chosen randomly
Function draw_text is where we draw the character in the image.  Now let's see what precautions we have taken so that the text could not be recognized by an OCR software.
1. Color of each character is different.
2. Font size of each character is different.
3. Characters are not in a horizontal line.  They are slightly off-positioned.  Well enough to fool the OCR software.  But not too out-of-position.  So that humans don't have a difficulty in recognizing the sequence.

Now let's add something more in the image, which will prevent the OCR software from recognizing the text.  Let's draw a line.

mid_x = image_x_size // 2    # Integer division, because random.randint needs integers
mid_y = image_y_size // 2

first_half_x = random.randint(1, mid_x - 1)
first_half_y = random.randint(1, mid_y - 1)

second_half_x = mid_x + random.randint(1, mid_x - 1)
second_half_y = mid_y + random.randint(1, mid_y - 1)

print("Going to draw a line from", first_half_x, ",", first_half_y, "to", second_half_x, ",", second_half_y)
d = ImageDraw.Draw(img)
# The second argument of line() is fill (the colour), not the width
d.line((first_half_x, first_half_y, second_half_x, second_half_y), fill=10)
First we decide the position of the line in the image.  We split the image at its midpoint, both horizontally and vertically.  In the upper-left quarter we choose a random position as one end point of the line.  In the lower-right quarter we choose a random position as the other end point.  Then we draw the line between the two end points.  Well, you could argue that a better method could have been used to draw the line.  Yes, I agree.  I leave this part up to you: implement a better method to draw a line, or maybe some other geometrical object, or maybe two geometrical objects.  Basically we want to have something that would prevent the OCR software from recognizing the text.
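
If you want to experiment with more than one line, the end point selection can be pulled out into a small function and called several times.  This is just a sketch of that idea.  The function names random_line_endpoints and random_lines are made up here, and the drawing call is shown only as a comment so that the sketch runs without PIL.

```python
import random


def random_line_endpoints(width, height):
    """Pick one end point in the upper-left quarter and one in the lower-right quarter."""
    mid_x = width // 2
    mid_y = height // 2
    start = (random.randint(1, mid_x - 1), random.randint(1, mid_y - 1))
    end = (mid_x + random.randint(1, mid_x - 1), mid_y + random.randint(1, mid_y - 1))
    return start, end


def random_lines(width, height, count=3):
    """Return `count` independent (start, end) pairs."""
    return [random_line_endpoints(width, height) for _ in range(count)]


for start, end in random_lines(200, 100):
    print("line from", start, "to", end)
    # With an ImageDraw object d, each line could be drawn as:
    # d.line(start + end, fill=10)
```

Because of how the end points are chosen, every line runs from the upper left to the lower right.  Choosing the quarters at random would be one way to vary that.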

img.save('captcha.png')   # Save the image as a file on the disk
Finally we save the image as a file named captcha.png
Is this code perfect?  I don't claim it to be.  Does this code generate CAPTCHAs that are easily recognizable by humans, but could never be recognized by OCR software?  I don't claim that either.  If you try hard enough, you might find OCR software that recognizes the text.  And then what?  Well, if I get to know that, maybe I'd change my code so that the flaw is removed.  After all, this is a game of cat and mouse that we all are playing, right?

And we have not generated an audio CAPTCHA yet.  So let's do that.

from gtts import gTTS

word = "TAAaGr"

text = ""
for char in (list(word)):
    text =  text + char + " "

speech = gTTS(text = text, lang = 'en', slow = True)

speech.save("captcha.mp3")

We are using the gtts library for this purpose.  For each character to be clearly audible, we separate the characters with spaces.  You'd notice that lowercase and uppercase letters are spoken in exactly the same way.  So, if we have to implement image and audio CAPTCHAs together, then along with digits we should use only lowercase letters or only uppercase letters.

What if we could make the challenge–response test a little more difficult for a non-human to solve, while keeping it easy for humans?  One way to do this is to show two numbers, ask for their sum, and check the response.  Let's see python code for this.

from PIL import Image
from PIL import ImageDraw
from PIL import ImageFont
import random



def draw_text(img, x_given, y_given, text_given, font_given, fill_given):
    d = ImageDraw.Draw(img)
    d.text((x_given, y_given), text_given, font=font_given, fill=fill_given)


image_x_size = 150
image_y_size = 70
img = Image.new('RGB', (image_x_size, image_y_size), color = (250, 250, 250))

num1 = random.randint(1,9)
num2 = random.randint(1,9)
total = num1 + num2
print(num1, "+", num2, "=", total)

font_size = random.randint(16, 50)
# Font size smaller than 16 is too small
# Font size bigger than 50 is too big

font_selected = ImageFont.truetype('/usr/share/fonts/gnu-free/FreeMono.ttf', font_size)

fill_selected = (random.randint(0, 200), random.randint(0, 200), random.randint(0, 200), random.randint(0, 255))
# Chose a random colour

x_pos = 10    # Position of the first character in the image

y_random = random.randint(0, 20)
# All characters should not be in a horizontal line.
# So shift position of each character randomly

draw_text(img, x_pos, y_random, str(num1), font_selected, fill_selected)

fill_selected = (random.randint(0, 200), random.randint(0, 200), random.randint(0, 200), random.randint(0, 255))
x_pos = x_pos + 30
y_random = random.randint(0, 20)
draw_text(img, x_pos, y_random, "+", font_selected, fill_selected)

fill_selected = (random.randint(0, 200), random.randint(0, 200), random.randint(0, 200), random.randint(0, 255))
x_pos = x_pos + 30
y_random = random.randint(0, 20)
draw_text(img, x_pos, y_random, str(num2), font_selected, fill_selected)

fill_selected = (random.randint(0, 200), random.randint(0, 200), random.randint(0, 200), random.randint(0, 255))
x_pos = x_pos + 30
y_random = random.randint(0, 20)
draw_text(img, x_pos, y_random, "=", font_selected, fill_selected)

img.save('captcha_num.png')   # Save the image as a file on the disk

Here are some of the CAPTCHA images that I generated from this.




For simplicity, I chose single-digit integers.  You could choose bigger numbers, if that suits you.
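
The code above only generates the image.  Checking the response is the other half.  Here is a minimal sketch of that step, assuming the response arrives as a string typed by the user.  The function name is_correct_answer is hypothetical.

```python
def is_correct_answer(expected_total, response):
    """Return True if the typed response is the expected sum."""
    try:
        return int(response.strip()) == expected_total
    except ValueError:
        # Anything that is not a whole number counts as a wrong answer
        return False


print(is_correct_answer(12, " 12 "))
print(is_correct_answer(12, "twelve"))
```

Surrounding spaces are ignored, because humans often type them by accident.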