Sunday, August 31, 2008

Hack attempts by country

I use denyhosts to block addresses that run dictionary attacks on my SSH server.

GeoIP and Python can be used to look up the country of origin of these addresses, and simple shell commands can generate a list of the most common countries.

$ cat
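import GeoIP, sys
# Assumption: the original initializer was lost; GeoIP.new() with a cache flag is the usual call.
gi = GeoIP.new(GeoIP.GEOIP_MEMORY_CACHE)
for addr in sys.stdin:
    print(gi.country_name_by_addr(addr.strip()))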

$ grep ssh hosts.deny |cut -d " " -f 2 |python |sort |uniq -c |sort -nr |head
34 China
17 None
17 Korea, Republic of
11 United States
5 United Kingdom
5 Italy
5 Brazil
4 Thailand
4 Japan
4 Germany

China wins. But please note that 17 addresses couldn't be resolved to a country (the "None" row), so the margin of error is pretty large.
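The whole pipeline can also be sketched in plain Python. This is only a rough equivalent of the grep, cut, sort and uniq steps above. The function names, the sample lines (assumed to look like "sshd: ADDRESS") and the dict standing in for the GeoIP database are all made up for illustration; with the real bindings, the lookup argument would be something like gi.country_name_by_addr.

```python
from collections import Counter

def ssh_addresses(lines):
    """Yield the second space-separated field of every line mentioning ssh."""
    for line in lines:
        fields = line.strip().split(" ")
        if "ssh" in line and len(fields) > 1:
            yield fields[1]

def top_countries(addresses, lookup, limit=10):
    """Tally lookup(address) over all addresses, most common first."""
    return Counter(lookup(addr) for addr in addresses).most_common(limit)

# Hypothetical sample data: documentation-only addresses and a dict
# standing in for a GeoIP database (unknown addresses map to None).
sample_lines = [
    "sshd: 192.0.2.1\n",
    "sshd: 192.0.2.2\n",
    "sshd: 198.51.100.7\n",
    "sshd: 203.0.113.9\n",
    "ALL: 192.0.2.50\n",
]
fake_db = {"192.0.2.1": "China", "192.0.2.2": "China", "198.51.100.7": "Italy"}

ranking = top_countries(ssh_addresses(sample_lines), fake_db.get)
for country, count in ranking:
    print("%d %s" % (count, country))
```

Passing the lookup in as an argument keeps the counting separate from GeoIP, so it can be tried without the database installed.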

Friday, August 29, 2008

Parse HTML using CSS selectors

lxml is a nice library for parsing XML and HTML with Python. It can use CSS selectors to find nodes.

Here's an example that shows how smooth it is to use.
>>> from lxml.html import parse
>>> google = open("google_se.html") # saved google result page for "example"
>>> root = parse(google).getroot()

This one fetches all the anchor texts (truncated so they don't break the page).
>>> [link.text_content()[:20] for link in root.cssselect(".g h3.r a")]
['Image results for ex', 'Example (rapper) - W', 'Example - Wikipedia,', ' - EXAMPL', 'Dynamic Programming ', 'example - definition', 'example - Definition', "Example - I don't wa", 'XML by Example - Goo', 'Example', 'example']

This one fetches all the link destinations (also truncated).
>>> [link.get("href")[:20] for link in root.cssselect(".g h3.r a")]
['', 'http://en.wikipedia.', 'http://en.wikipedia.', 'http://www.myspace.c', '', 'http://www.thefreedi', 'http://www.merriam-w', '', '', 'http://www.example.o', 'http://www.docbook.o']

Monday, January 21, 2008

OpenWRT 7.09 for the NSLU2

The NSLU2 edition of OpenWRT 7.09 is finally ready for download! Actually this was a month ago, but I just noticed, hehe. I own an NSLU2 and it has been running OpenWRT 7.07 for a while. I am very pleased.

Changes since Kamikaze 7.07

- Swap is now enabled in the kernel
- The eth0 interface requests an IP address using DHCP
- Documentation updates
- UCI updates - uncommitted changes are now active on config reads
- PPP fixes
- Firewall fixes for dynamic interfaces
- Config enhancements for dnsmasq
- Prevent interfaces from accidentally being started twice at boot time
- Fix QoS for dynamically assigned interfaces

Woot! Swap! Praise FSM.

BTW, here's a good optware mirror and openwrt mirror at OSUOSL.