Wednesday, September 2, 2009

Bangbros to directories

Open the Bang Brothers member page you get after clicking 'Next page of XYZ updates>>'. Save the page to your HDD, and do the same with additional pages if you need them.

Let's say you saved the page as bangbros.html. Here are a few commands you can run against your HTML file to get a nice, clean directory structure for your clips and pics.


tr -d "\n\r\t" < bangbros.html | sed -E "s/ {2,}//g" > bb2.txt

grep -Po "(..\d{4}).html.\>\<b\>(.*?)\</b\>\</a\>\</p\>(.*?)Added: (\w*?) (\d{2}), (\d{4})" bb2.txt >bb3.txt

sed -e "s/January/01/" -e "s/February/02/" -e "s/March/03/" -e "s/April/04/" -e "s/May/05/" -e "s/June/06/" -e "s/July/07/" -e "s/August/08/" -e "s/September/09/" -e "s/October/10/" -e "s/November/11/" -e "s/December/12/" bb3.txt >bb4.txt

sed -e "s/.html.>//" bb4.txt>bb5.txt

sed -e "s/<b>/-/" -e "s/<\/b><\/a><\/p>Added: /-/" -e "s/,//" bb5.txt>bb6.txt

sed "s/\(......\)-\(.*\)-\(..\) \(..\) \(....\)/mkdir \"\5-\3-\4 - (\1) - \2\"/" bb6.txt
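If you would rather not keep the intermediate bb2.txt to bb6.txt files around, the same steps can be chained into one pipeline. This is only a sketch: the function name page_to_mkdir is made up for illustration, it assumes GNU grep (for -P) and GNU sed (for -E), and it only swaps a month name that directly follows "Added: " so that a month appearing in a title is left alone.

```shell
# Hypothetical helper, not part of the original command list.
# Input:  $1 = a saved listing page, e.g. bangbros.html
# Output: one mkdir command per update, written to stdout
# Steps:  flatten the page to one line, drop runs of spaces, pull out
#         "id + title + date", turn the month name into a number,
#         then rearrange everything into: mkdir "YYYY-MM-DD - (id) - title"
# Assumes GNU grep (-P) and GNU sed (-E).
page_to_mkdir() {
    tr -d '\n\r\t' < "$1" |
    sed -E 's/ {2,}//g' |
    grep -Po '(..\d{4})\.html.><b>(.*?)</b></a></p>(.*?)Added: (\w*?) (\d{2}), (\d{4})' |
    sed -e 's/Added: January /Added: 01 /'   -e 's/Added: February /Added: 02 /' \
        -e 's/Added: March /Added: 03 /'     -e 's/Added: April /Added: 04 /' \
        -e 's/Added: May /Added: 05 /'       -e 's/Added: June /Added: 06 /' \
        -e 's/Added: July /Added: 07 /'      -e 's/Added: August /Added: 08 /' \
        -e 's/Added: September /Added: 09 /' -e 's/Added: October /Added: 10 /' \
        -e 's/Added: November /Added: 11 /'  -e 's/Added: December /Added: 12 /' |
    sed -E 's#^(.{6})\.html.><b>(.*)</b></a></p>.*Added: (..) (..), (....)$#mkdir "\5-\3-\4 - (\1) - \2"#'
}
```

Called as page_to_mkdir bangbros.html > dirs.txt, it should give the same kind of list as the step-by-step version (dirs.txt is just an example name). For several saved pages, run it once per file and append the results with >>.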

Those few commands take you from some HTML which looks like this:

...
<td align="left" valign="top" width="24%">
<p><a href="http://members.bangbrosnetwork.com/bangbus/intro/bb4222.html"><b>Spring Break Hottie</b></a></p>




Added: March 12, 2008<br>
<p>Website: <a href="http://members.bangbrosnetwork.com/bangbus/main-1.html">bangbus.com</a></p>




<div><img src="bangbus_files/small_7.gif" alt="bar 7" border="0" height="12" width="58"></div>
<p><small>Rating: 6.78 (674 votes)</small></p></td>
...

to a nice, clean list of commands that you can run, which looks like this:

mkdir "2008-03-12 - (bb4222) - Spring Break Hottie"
mkdir "2008-03-05 - (bb4197) - Kangaroo spotting"
mkdir "2008-02-27 - (bb4173) - Cock Hungry SaraJay"
...
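One possible way to run the list, assuming you redirected it into a file first: instead of feeding the lines straight to the shell, this sketch pulls the name out of each mkdir "..." line and creates the directory itself, so odd characters in a title are not interpreted by the shell. The function name make_dirs and both arguments are made up for illustration.

```shell
# Hypothetical helper: create the directories named in a saved command list.
# $1 = file with lines like: mkdir "2008-03-12 - (bb4222) - Spring Break Hottie"
# $2 = directory to create them in
# Lines that do not match the mkdir "..." shape are skipped.
# Note: a title containing "/" would end up as nested directories.
make_dirs() {
    sed -n 's/^mkdir "\(.*\)"$/\1/p' "$1" |
    while IFS= read -r name; do
        mkdir -p -- "$2/$name"
    done
}
```

For example, make_dirs dirs.txt ~/clips would create one directory per line under ~/clips. Because of mkdir -p, running it again on an updated list only adds the new entries.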
