Kuro5hin.org: technology and culture, from the trenches
Retrieving Flash Videos from the Internet: The Hard Way

By mybostinks in Internet
Wed Jan 02, 2008 at 11:00:16 PM EST
Tags: downloading flash videos, internet, yoMamma

Recently I had a lot of time on my hands while visiting relatives and in-laws. Most of the time I was able to get an Internet connection, although sometimes I had to leech it from unsecured wireless access points. Some of my relatives and friends had posted videos on youtube.com, and they also wanted to save other flash videos to their hard drive or a USB drive. This can be frustrating at first if you don't understand what is going on, so I decided to download them the hard way. For the faint of heart, scroll all the way to the bottom of the article for the easy way to download flash video and audio files.

Why do this the hard way?
First, I wanted to understand why it was less than straightforward to do. I understand why proprietary software developers are scared right now: software is slowly moving to Internet applications and, as most of you know, the Internet is wide open, even SSH and SSL to some degree.

With youtube.com, Google video and some of the more popular video sites you can download and install software that automatically grabs the video or audio flash file. However, I had a lot of time on my hands, and there are many sites that the video downloaders do not work with.

If you find doing it yourself interesting then here's one way to download flash video files. Keep in mind this only works with embedded flash video or audio streams.


The tools
First you will need to install a couple of programs. The programs described below have both Windows and Linux/*nix versions. The list below points to the Windows versions. If you are partial to Linux, you can use your favorite package manager to download and install what you need for your distribution. For the *BSDs, you can use the ports collection to install the utilities.

I will describe how to use these tools to capture the packets I want. If you are a programmer, then you can take what I describe and write a script or program in the language of your choice. I wrote mine in Ruby and also ported it to Perl. In Ruby at least it is trivial.

Wireshark - This is the former Ethereal packet sniffer. Wireshark not only has a GUI front end, it also has a command line interface (tshark).
Winpcap - If you install the newest version of Wireshark it will install winpcap as well.
HTTP Sniffer - Optional if you don't want to mess with wireshark or tshark. In some ways, HTTP Sniffer is better suited but a little harder to work with because it is still under development.
wget or curl - Either of these command line programs is great to use. My preference for most tasks is wget. Curl has features wget does not have. For our purposes, however, wget will work fine.
VideoLan - VideoLan is a client-side Audio-Video player that plays .flv files on your computer as well as straight off a video website.

Very brief howto on packet sniffing
After installing the above programs you will need to determine which interface you are using. On Linux or Unix, tshark (or tcpdump, if you use that instead) will pick an interface automatically. On Windows, issue the following command from the C:\> prompt:
C:\> tshark -D
You will see something like the following:
1. \Device\NPF_GenericDialupAdapter (Adapter for generic dialup and VPN capture)
2. \Device\NPF_{EB5F7518-B4A0-4D10-B795-ED9744D59228} (VMware Virtual Ethernet Adapter)
3. \Device\NPF_{15A058C5-FA9F-4FA5-A38C-74D322C36EA4} (Broadcom 802.11b (Microsoft's Packet Scheduler) )
4. \Device\NPF_{026B866A-79B3-43C7-A0F0-6809D6C7C4E9} (VMware Virtual Ethernet Adapter)
5. \Device\NPF_{AAD42C94-85EC-47A6-B4D7-EC374D0628A2} (National Semiconductor Corp. DP83815 10/100 MacPhyter3v PCI Adapter (Microsoft's Packet Scheduler) )
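
If you want to script the interface choice rather than read the listing by eye, the following is a minimal Python sketch. It assumes the tshark -D output has exactly the "number, device, description in parentheses" layout shown above. The names parse_interfaces and find_interface are made up for this example and are not part of tshark.

```python
import re

# One line of `tshark -D` output, in the layout shown above (an assumption):
#   "<number>. <device> (<description>)"
_INTERFACE_LINE = re.compile(r"^(\d+)\.\s+(\S+)\s+\((.*)\)\s*$")


def parse_interfaces(listing):
    """Return a list of (number, device, description) tuples."""
    interfaces = []
    for line in listing.splitlines():
        match = _INTERFACE_LINE.match(line.strip())
        if match:
            number, device, description = match.groups()
            interfaces.append((int(number), device, description.strip()))
    return interfaces


def find_interface(listing, keyword):
    """Return the number of the first interface whose description
    contains keyword (case-insensitive), or None if nothing matches."""
    for number, _device, description in parse_interfaces(listing):
        if keyword.lower() in description.lower():
            return number
    return None
```

With the listing above, find_interface(listing, "802.11b") would return 3, which is the number to pass to tshark -i.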

I had my Ethernet interface disconnected and was using the Broadcom 802.11b interface (number 3). Once you have determined the active interface, check that it is correct by issuing the following from the command line:
C:\> tshark -i 3
You should immediately see something similar to the following:
17.846631 192.168.1.100 -> 192.168.1.104 TCP informer > netbios-ssn [FIN, ACK] Seq=1008 Ack=811 Win=64725 Len=0
17.846871 192.168.1.104 -> 192.168.1.100 TCP netbios-ssn > informer [ACK] Seq=811 Ack=1009 Win=65535 Len=0
17.849765 192.168.1.104 -> 192.168.1.100 TCP netbios-ssn > informer [FIN, ACK] Seq=811 Ack=1009 Win=65535 Len=0
17.849825 192.168.1.100 -> 192.168.1.104 TCP informer > netbios-ssn [ACK] Seq=1009 Ack=812 Win=64725 Len=0

If you don't see anything scrolling on your screen, you chose the wrong interface. List the interfaces again with:
C:\> tshark -D
and try a different one.

Here is a brief explanation of the above lines. The first field on the left is the timestamp, in seconds since the capture started. The next field is the source IP address and the third field is the destination IP address. The fourth field is the protocol (TCP or UDP), and the rest of the line describes the event: the source and destination ports, followed by other packet information not relevant to the task at hand.

Most if not all of this session is useless for our purposes. You can see that my laptop is making netbios requests to a Samba server.
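
To make the field layout concrete, here is a small Python sketch that splits one summary line into those fields. It assumes the exact column layout of the sample output above, with a "->" arrow between the addresses; other tshark versions may print the columns differently. PacketSummary and parse_summary are names invented for this example.

```python
from typing import NamedTuple, Optional


class PacketSummary(NamedTuple):
    seconds: float      # time since the capture started
    source: str         # source IP address
    destination: str    # destination IP address
    protocol: str       # e.g. TCP or UDP
    info: str           # ports, flags and the rest of the line


def parse_summary(line: str) -> Optional[PacketSummary]:
    """Split one tshark summary line; return None if it does not fit."""
    parts = line.split(None, 5)
    if len(parts) < 6 or parts[2] != "->":
        return None
    seconds, source, _arrow, destination, protocol, info = parts
    try:
        return PacketSummary(float(seconds), source, destination, protocol, info)
    except ValueError:
        # The first column was not a number, so this is not a packet line.
        return None
```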

Looking for the correct packets
Now that we know how to capture packets, we can move on to finding the ones that are useful. We want the packets that tell us where youtube's videos are stored so that we can download them.

All the bits and bytes swarming around the Internet are visible. Some packets are encrypted and not humanly readable, but they are there nonetheless. HTTP traffic is very visible and public. Sometimes it is encrypted, and sometimes it is obscured either by necessity or on purpose. Youtube .flv files are not encrypted, so we don't need to worry about that. Their location is somewhat obscured, however, and that is what we want to discover. The information we want is in a TCP stream carrying the HTTP headers that tell us exactly where the .flv file is located.

Capturing the packets
First start up wireshark or tshark but don't start capturing traffic yet. Next, in your browser, search for the video you want to download. In this example I am going to download the Snowbound video by Donald Fagen. Once you have found the video, start capturing in wireshark. The URL for this particular video is: Snowbound - Donald Fagen. Play the video while wireshark captures packets. It is OK to stop the video once it starts streaming.

After the video is stopped, switch to the wireshark program and stop capturing packets. In the filter box type http and click on 'Apply' and wireshark will filter out everything but the http protocol. This is what we want.

Once you have done this you will notice a lot of GET requests. These are what we are looking for: they appear in the HTTP headers and carry the information we need. Next to one of these GETs you should see the following:
get_video?video_id=0MGtr121fFI
Now highlight the line, right-click on it and choose 'Follow TCP Stream' from the menu. A pop-up will appear. Scroll down to near the bottom of the pop-up and find the word 'Location'. You should see something like the following:
http://chi-v9.chi.youtube.com/get_video?video_id=0MGtr121fFI
You will notice that it is a typical URL. The host part of the URL (chi-v9.chi.youtube.com here) will vary, probably because of load balancing across the youtube servers. To confirm that this works, copy the URL and paste it into your browser. When you hit the Enter key in Firefox, a pop-up will appear asking whether you want to save the file. Save it to disk and replace the default name, get_video, with another name. It can be anything you want, but make sure it has the extension .flv for flash video. If you have installed VideoLan or another flash video player, you can then click on the file and play it from your hard drive.
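
One quick way to confirm that what you saved really is flash video, and not an HTML error page with an .flv name, is to look at the first bytes of the file: FLV files begin with the three ASCII characters FLV. A minimal Python sketch follows; the function name looks_like_flv is only an illustration.

```python
# FLV files start with the ASCII signature "FLV".
FLV_SIGNATURE = b"FLV"


def looks_like_flv(path):
    """Return True if the file at path begins with the FLV signature."""
    with open(path, "rb") as handle:
        return handle.read(len(FLV_SIGNATURE)) == FLV_SIGNATURE
```

This only checks the signature, so it will not catch a download that was cut off part way through.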

Another example: redtube.com
Redtube.com is another flash video sharing site. Unlike youtube.com it is NSFW, so you have been warned.

Again using wireshark, find the video you want to download and before you click on the thumbnail image, start capturing packets. Click on the thumbnail and it will take you to the flash video stream. The http header we are looking for is:
GET /_videos_t4vn23s9jc5498tgj49icfj4678/0000003/J3FXZI1SQ.flv?start=0
The above GET is what we need, but we have to chop off the parameter '?start=0'; otherwise the download will fail.

Next, we need to find the host server. We do that by searching the request headers for "Host" and we will find:
dl.redtube.com
Putting the host and the path together into a URL we get:
http://dl.redtube.com/_videos_t4vn23s9jc5498tgj49icfj4678/0000003/J3FXZI1SQ.flv
This will now get us the video. If you again put this in your browser's address bar a pop-up will appear and ask if you want to save the file. Go ahead and save it and give it a meaningful name. WARNING! and all that stuff...this video is not safe for work.
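
The same assembly can be sketched in Python. This is only an illustration: url_from_request is a made-up helper, it assumes plain http, and it drops the query string by default because that is what this site needed. Other sites (the youtube get_video requests, for instance) need the query string kept, hence the keep_query switch.

```python
from urllib.parse import urlsplit, urlunsplit


def url_from_request(request_line, host, keep_query=False):
    """Build a download URL from a captured GET line and the Host value.

    request_line looks like "GET /path?start=0" or "GET /path HTTP/1.1".
    """
    parts = request_line.split()
    if len(parts) < 2 or parts[0] != "GET":
        raise ValueError("not a GET request line: %r" % request_line)
    target = urlsplit(parts[1])
    # Chop off parameters such as ?start=0 unless asked to keep them.
    query = target.query if keep_query else ""
    return urlunsplit(("http", host.strip(), target.path, query, ""))
```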

Google videos (googlevideo.com)
Now that Google owns Youtube, Google video uses those servers. Things get a little confusing, and grabbing a Google video works differently. This time I will use the HTTP Sniffer utility, which captures not only the headers we want but the streamed video data as well. Start the utility as you did tshark or wireshark and 'debug' the browser you are using. (HTTP Sniffer says 'debug' where wireshark says 'capture', which confused me at first.) Then start streaming the Google video of your choice, go back to HTTP Sniffer, and you will see data. Right-click on one of the rows and select Save All. You may want to stop the Google video stream because it can produce quite a large log file. Remember where you save the log file.

Now you want to open the log file in your favorite text editor (vim or gvim is well suited for this) and search for something like the following:
HTTP/1.1 302 Found
Location: http://74.125.1.80/get_video?video_id=dcLMH8pwusw&origin=mia-v5.mia.youtube.com
Connection: close


GET /get_video?video_id=dcLMH8pwusw&origin=mia-v5.mia.youtube.com HTTP/1.1
Host: 74.125.1.80
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8, image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
At this point you can put together the URIs again or use the 'Location' field, and you should come up with the following:
http://74.125.1.80/get_video?video_id=dcLMH8pwusw&origin=mia-v5.mia.youtube.com
Put this in the Address field of your browser and once again a pop-up will appear asking if you want to save the file.
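
If the log is large, searching it by hand gets old quickly. Below is a minimal Python sketch that pulls every 'Location' value out of the saved log text. It assumes the HTTP Sniffer log contains the raw header text much as in the excerpt above, which may not hold for every version of the tool; redirect_targets is a name made up for this example.

```python
import re

# A Location header: the word "Location:" at the start of a line or after
# whitespace, followed by the URL up to the next whitespace character.
_LOCATION = re.compile(r"(?:^|\s)Location:\s*(\S+)", re.IGNORECASE | re.MULTILINE)


def redirect_targets(log_text):
    """Return the distinct Location URLs in log_text, in order of appearance."""
    seen = []
    for url in _LOCATION.findall(log_text):
        if url not in seen:
            seen.append(url)
    return seen
```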

Using wget
First download wget. It is included on many *nix and Linux distributions. For Windows you can download a pre-compiled version here. I use cygwin on my Windows boxes so for me it was easy to install.

Once you have confirmed that you can save youtube videos it might be easier to use the wget or curl utilities to grab them. Wget has the ability to download files in many different ways. Wget runs from the command line in either Windows or *nix/Linux. The command line options you can use are:
C:\> wget -O MyPr0n.flv <OneOfTheURLsBelow>
http://chi-v9.chi.youtube.com/get_video?video_id=0MGtr121fFI
http://dl.redtube.com/_videos_t4vn23s9jc5498tgj49icfj4678/0000003/J3FXZI1SQ.flv
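http://74.125.1.80/get_video?video_id=dcLMH8pwusw&origin=mia-v5.mia.youtube.com
The -O option lets you save the file under a more human readable name. Put the URL in double quotes if it contains an & character, because the command shell otherwise treats the & specially and cuts the URL short.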
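
If you drive wget from a script, passing the arguments as a list avoids the shell's handling of characters like & and ? altogether. A minimal Python sketch, assuming wget is installed and on the PATH; wget_command and download are hypothetical helper names, and only the -O option described above is used.

```python
import subprocess


def wget_command(url, filename):
    """Return the wget argument list that saves url under filename."""
    return ["wget", "-O", filename, url]


def download(url, filename):
    """Run wget and return its exit status (0 means success).

    Because the arguments are passed as a list, no shell is involved
    and the URL needs no quoting.
    """
    return subprocess.run(wget_command(url, filename)).returncode
```

For example, download(url, "video.flv") with one of the URLs above as url would save the stream as video.flv in the current directory.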

Other hard ways to get flash video files
1. Search through the JavaScript on the web page by using 'View Page Source'. I started out doing it this way; it is tedious but worth doing if you want to learn how it is coded.
2. Decompile the .swf file. I read about this one but haven't tried it. Some swf files are compressed and you will first have to decompress them, then decompile them. There are several free and commercial programs that do this.

OK, so here's the easy way
Now that we know what to look for you can write a script or program to automate all this. I was too lazy to do this and besides this article is getting too long. If all this is a bit much then just download the Orbit Downloader 2.0. There are any number of video/audio downloaders but Orbit is my favorite.

A final sidenote
I saw this comment by mr strange and used the above technique to download and save the montage about trhurler he refers to. I have always enjoyed trhurler's comments and diaries, ever since I first became a Kuron in 2001. If you are interested, the following link will download the flash video from myspacetv.com, or you can paste it into a flash player...enjoy and Rest In Peace trhurler.




Retrieving Flash Videos from the Internet: The Hard Way | 37 comments (30 topical, 7 editorial, 0 hidden)
another option: (3.00 / 2) (#2)
by horny smurf on Tue Jan 01, 2008 at 11:48:52 PM EST

use privoxy (or some other proxy) which logs requests.

excellent suggestion (none / 1) (#3)
by mybostinks on Wed Jan 02, 2008 at 12:05:39 AM EST

i have used privoxy before with TOR.

[ Parent ]
FUCKING PAEDO (1.20 / 5) (#11)
by ray eckson on Wed Jan 02, 2008 at 01:23:40 PM EST

FILING ABUSE REPORT


wampsy: hey ray why don't you start up a site. you could call it ray5.
rusty: I gotta fix that stupid cancel bug.
booger: How's that for daring to get ray eckson all sniffy, you cow?
poopy: Not that I'm gay or anything, but for you I might make an exception.
[ Parent ]
They turned that off as well (none / 1) (#12)
by mybostinks on Wed Jan 02, 2008 at 01:27:22 PM EST

it was part of the clean-up of K5.

[ Parent ]
One word: keepvid (3.00 / 2) (#5)
by b1t r0t on Wed Jan 02, 2008 at 01:13:09 AM EST

You could do all that crap, or you could think to yourself: "Self, is it possible that someone has already done all this before so I don't have to?" and find http://keepvid.com/

-- Indymedia: the fanfiction.net of journalism.
I already addressed that (1.50 / 2) (#6)
by mybostinks on Wed Jan 02, 2008 at 01:27:11 AM EST

at the very end of the article under the 'easy way'.

There are lots of ways to do this. I was only trying to show how you can discover doing it. There is more than one way to do it.

Great link btw.

[ Parent ]

And I failed it (none / 0) (#8)
by b1t r0t on Wed Jan 02, 2008 at 08:42:20 AM EST

It was hidden sufficiently well that I didn't notice at first glance, until after I posted that. Still, keepvid is what I use, and they seem to have added a plethora of other video sites since I last checked.

-- Indymedia: the fanfiction.net of journalism.
[ Parent ]
miro (none / 0) (#27)
by Sacrifice on Thu Jan 03, 2008 at 05:37:47 PM EST

Miro (http://www.getmiro.com/) also knows how to save video from the major sites.  I've not tried Keepvid, or Orbit (which was mentioned in the article).

[ Parent ]
How about this (3.00 / 6) (#7)
by Verbophobe on Wed Jan 02, 2008 at 03:58:51 AM EST

  • Watch video
  • Open up your browser's cache
  • Run "file * | grep flash" on the whole directory
  • Play each of them in vlc or mplayer or god knows what until you find your desired video
  • Save to your porn folder

Somewhat simpler than packet sniffing, and step 3 is optional, as you can go by file size/creation date instead.



Proud member of the Canadian Broadcorping Castration
Doesn't work with all browsers (none / 0) (#24)
by smallstepforman on Thu Jan 03, 2008 at 01:24:10 AM EST

Well, it works under Opera, and I frankly don't care about the others, but just thought I'd let others know that it's extremely simple in Opera.

[ Parent ]
Pushing this to vote (none / 1) (#10)
by mybostinks on Wed Jan 02, 2008 at 01:23:07 PM EST

so that it will dump in short order.

I REALLY wish 'they' would fix the Cancel Submission control.

OMG Piracy! (none / 1) (#13)
by sausalito on Wed Jan 02, 2008 at 03:16:21 PM EST

Hence FP.

Actually, I'm not sure it is. Anybody's bothered reading YouTube TOS?
_____________

GBH - "The whole point is that the App Store acts as a firewall between busy soccer moms and goatse links"

I wondered about this... (none / 1) (#14)
by mybostinks on Wed Jan 02, 2008 at 03:26:41 PM EST

but unlike p2p downloading it would be hard to entrap a downloader. A good deal of uploaded videos contain copyrighted material. I should have put a disclaimer in the article somewhere.

[ Parent ]
It is (3.00 / 3) (#15)
by sausalito on Wed Jan 02, 2008 at 03:33:59 PM EST

Article 4. of YouTube TOS:

... C. You agree not to access User Submissions (defined below) or YouTube Content through any technology or means other than the video playback pages of the Website itself, the YouTube Embeddable Player, or other explicitly authorized means YouTube may designate.

_____________

GBH - "The whole point is that the App Store acts as a firewall between busy soccer moms and goatse links"
[ Parent ]

It would be interesting to find out if (none / 1) (#16)
by mybostinks on Wed Jan 02, 2008 at 03:41:43 PM EST

they have ever enforced this, and how they would know if someone downloaded a video as opposed to finding it in the cache as a result of accessing the video with a browser.

It is trivial to make any downloading utility such as wget or cURL look like a browser. The User-Agent in the request header is easy to fake.

[ Parent ]

A way to do it would be (none / 1) (#17)
by sausalito on Wed Jan 02, 2008 at 04:02:41 PM EST

to embed some sort of one-time encryption key that is exchanged between the server and the browser every time the video is played. But this would require some sort of specialised YouTube player in the browser, which would be awkward and could be reverse engineered.

In the end, either you go all the way towards a Sony rootkit type of thing or you're powerless.

_____________

GBH - "The whole point is that the App Store acts as a firewall between busy soccer moms and goatse links"
[ Parent ]

not really no, or maybe (none / 0) (#36)
by kromagg on Wed Jan 09, 2008 at 03:28:37 PM EST

Piracy typically refers to redistributing copyrighted works without a license (copyright mainly concerns distribution). Here you are just converting the work from one type to another, so it's not piracy in the old definition

The DMCA however prohibits circumventing DRM. Now you could definitely see some lawyer arguing that the embedded video player is actually a form of DRM and getting through to the content breaks this DRM. Well it sounds ridiculous typing this, but lawyers know their stuff. :-)

This is just off the top of my head; copyright law is weird and extensions such as the DMCA even weirder.

[ Parent ]

I use keepvid (.com) (3.00 / 2) (#19)
by xC0000005 on Wed Jan 02, 2008 at 06:05:08 PM EST

because, well, I'm lazy and it works. It's always fun to do it the "hard" way though.

Voice of the Hive - Beekeeping and Bees for those who don't
keepvid is nice but (none / 1) (#20)
by mybostinks on Wed Jan 02, 2008 at 06:15:06 PM EST

there are some sites it does not work on. For example, the redtube site.

[ Parent ]
Snore (3.00 / 2) (#25)
by BJH on Thu Jan 03, 2008 at 07:01:17 AM EST

I use youtube-dl myself.

What was the point of that shit at the start about SSH and SSL? You could have just started from the paragraph that began "With youtube.com...".
--
Roses are red, violets are blue.
I'm schizophrenic, and so am I.
-- Oscar Levant

Sometimes you need cookies or password for wget. (none / 0) (#26)
by claes on Thu Jan 03, 2008 at 03:41:54 PM EST

  1. Login with browser.
  2. Find cookies file.
    cd; find .mozilla -name cookies.txt
    There may be a bunch, so you might have to try different ones.
  3. use
    wget --load-cookies (wherever)/cookies.txt ...
    to download.
  4. wget also accepts --user=username and --password=password (weird about the =);
Check wget man page for details.

Good article.

Much easier way (with Firefox) (3.00 / 4) (#28)
by Sashazur on Thu Jan 03, 2008 at 07:17:12 PM EST

Do these steps, no software needed (except Firefox)
  • Tools > Clear Private Data (this is optional but reduces what you need to search later)
  • Watch the whole video (or at least watch it until it is all downloaded)
  • Go to the URL about:cache?device=disk
  • Search for .flv (there may be more than one if you watched multiple videos)
  • When you find a match, click the link
  • On the next page, right-click the link and Save Target As...
All done.
(To watch the .flv file, if you have Windows, you will need either a flash video player or the K-Lite codec pack)


Clarification (none / 0) (#29)
by Sashazur on Thu Jan 03, 2008 at 07:20:45 PM EST

BTW the above works on Windows w/YouTube and some unmentionable sites, haven't tried it on any other OS or other sites.

[ Parent ]
Unmentionable As In YouPorn? $ (none / 0) (#37)
by icastel on Mon Jan 14, 2008 at 01:40:50 PM EST




-- I like my land flat --
[ Parent ]
You don't need a PhD in comp sci... (none / 1) (#30)
by a boy and his bike on Thu Jan 03, 2008 at 08:44:40 PM EST

Just go here
http://www.ripzor.com/
Then get ffmpeg to convert the flv to mpg.

Or just get the ffmpeg codec (none / 0) (#31)
by ksandstr on Fri Jan 04, 2008 at 07:15:29 AM EST

It can decode .flv on the fly. No information loss due to re-encoding and the original filesize is kept.

Fin.
[ Parent ]
Maybe you're right; (none / 0) (#32)
by a boy and his bike on Fri Jan 04, 2008 at 10:42:22 PM EST

but my machine is too slow to decode flv on the fly, but hard drive space is plentiful. Actually it can play them back but the cpu is pegged at 100% and it skips frames.

[ Parent ]
How slow, exactly? (none / 0) (#33)
by ksandstr on Sat Jan 05, 2008 at 04:45:24 AM EST

I just tested on a 400-mhz mobile P2, and there's little issue with smooth replay. Do you have an issue with video drivers, e.g. YUV overlay not working properly? Or perhaps some needless postprocessing filters are left on by default? (a deinterlace filter can really suck the memory bandwidth.)

Fin.
[ Parent ]
an other way is to do it with swfdec (none / 0) (#34)
by giorgosg on Sat Jan 05, 2008 at 05:47:15 AM EST

get rid of flash on linux and install swfdec-mozilla. Most flash will not work and videos when they do work will not have sound on the browser but you can right click and see a list of the media a flash app pulls. You can just choose the flv and save it.

TOPICAL COMMENT (none / 0) (#35)
by GhostOfTiber on Sat Jan 05, 2008 at 06:28:49 PM EST

HOW DO I PLAY .RAD FILES I'VE STOLEN FROM RHAPSODY?

[Nimey's] wife's ass is my cocksheath. - undermyne
