WinSieve requires no installation: just un-zip it and launch it.
The first 5 buttons enable/disable the sieving on that specific browser.
(eg: you might normally want to only sieve the browser you use the most)
If a browser can't be found on your computer, its button will be "turned off".
See how to install more browsers.
Following up, you find "settings", letting you adjust WinSieve exactly to your needs.
You're already familiar with "help", for it likely brought you here.
But your favourite button is definitely gonna be the last one: the one which starts the whole process.
Notice how it's conveniently pre-focused: so you can just launch WinSieve and hit ENTER to go.
WinSieve auto-configures, so you probably won't ever really need to tamper with the settings. However, especially on your first start, you might want to check out..
Click the bottom margin of WinSieve (the one with the tiny arrow down), and the main "drawer" will pull open: here you can choose "a la carte" the kind of stuff you want to retrieve, how accurately, and what to do with it. The usage of the "drawer" is supposed to be absolutely intuitive, so here are just a few highlights to clue you in:
No need to get any more specific than this, right now. It's finally time to get started with..
It basically consists of four steps, obviously repeated for each selected browser:
..which by default is called NewCache, and is placed on your desktop. (Or wherever you wish.)
The data contained there is normally grouped by browser, and is stored in sub-folders according to its type (eg: documents, media, programs, fonts, etc.) and category (eg: all images are shunted into banners, graphic elements, thumbnails, small pictures, etc.).
When the sieving is done, you can close WinSieve. And perhaps clear the internet cache from your PC, not only for safety/privacy reasons.
An overgrown cache may slow down the performance, not to mention the total waste of time it would be to keep re-sieving previously acquired data. Furthermore, when a certain amount of cache is reached, the browser automatically erases older portions of it in order to make room for new data: that's one more reason to regularly use WinSieve and then empty the cache.
And finally, bulk-copy the NewCache folder to your USB-key, for "harvesting" it later at home. Or open it right now, and check out what WinSieve has fetched for you.
It couldn't be any easier than that! Just move out of that sieved-data folder whatever you're interested in keeping, then trash it.
You could also keep it there, but it's not recommended: sieving after sieving, NewCache can easily grow up to be quite a large "no man's land" - if you know what I mean.
Oh, by the way: you can interrupt any step of the sieving anytime by closing the progressbar window.
If you reply NO, the process will resume.
(So that's also a handy way to pause, should you ever need to.)
If you say YES, the sieving will stop.
PS: thanks to Windows file-caching, if you then start another sieving the process will be much faster (until you get back to the point where you aborted).
The first tab tells WinSieve where the internet cache of each browser is located.
This normally auto-configures, but you're free to browse using the ".." buttons (or typing-in another path).
It also lists the folders whose data must be excluded.
eg: if a picture is already in "My Pictures", it will be skipped
"Extract to" indicates where the sieved data shall be stored.
If the folder does not exist, it will be created at each sieving. Otherwise, new data will be appended to it.
In that case, I recommend letting WinSieve "Skip duplicates from other browsers too".
eg: if "NewCache\Internet Explorer\Icons" already has an icon found in FireFox, no need to store it also in "NewCache\FireFox\Icons"
After sieving ("once done") you can have WinSieve automatically pop-up the Internet Options, and/or open the NewCache folder, and/or.. take leave.
The second tab instructs WinSieve about what a thumbnail or a banner is, when and how files should be renamed, etc.
eg: if an image is at least 20 pixels wide and 50 pixels tall, and smaller than 100x80 pixels, and does not weigh more than 4000 bytes, then it's a thumbnail.
Also if its name begins with "THUMS" or ends with "_T" or has "_TH" anywhere in its name, but only if it's a JPG.
These parameters are (supposedly) pretty intuitive, so just change them and see what happens - at your own risk. :-7
Or better still: take a look at the techie stuff to learn more about each of them.
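To make the thumbnail example above a bit more concrete, here is a minimal Python sketch of that rule. It is not WinSieve's actual code: the function and parameter names are made up for illustration, "smaller than 100x80" is assumed to mean that both sides stay below those values, and the byte limit is assumed to apply to the name-based check too.

```python
def looks_like_thumbnail(name, ext, width, height, size_bytes,
                         min_w=20, min_h=50, max_w=100, max_h=80,
                         max_bytes=4000):
    """Hypothetical re-statement of the example thumbnail rule.

    name is the file name without extension, ext the extension without dot.
    """
    # Files heavier than the byte limit are never thumbnails (assumption:
    # this also overrides a matching name).
    if size_bytes > max_bytes:
        return False

    # Rule 1: the image measures fall inside the thumbnail range.
    by_size = min_w <= width < max_w and min_h <= height < max_h

    # Rule 2: a tell-tale name, but only for JPG files.
    stem = name.upper()
    by_name = ext.upper() == "JPG" and (
        stem.startswith("THUMS") or stem.endswith("_T") or "_TH" in stem
    )
    return by_size or by_name
```

For instance, `looks_like_thumbnail("pic_t", "jpg", 640, 480, 3500)` is accepted because of its name, while the same file saved as PNG is not.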
The third tab enables/disables the advanced features of WinSieve, and also allows you to tweak them.
The other parameters will become clear as you later-on learn of the report and the regrouping feature.
The fourth tab lets you adjust the type (and accuracy) of the file-frisking feature.
Whenever you start WinSieve, you always find it exactly as you had configured it.
However you might want to use the program on another computer, keeping the same settings. Or maybe you need to reinstall/upgrade Windows, and those data would be lost. In those cases, just save your settings: when you copy WinSieve's program folder, you'll bring'em along with you.. ready to be loaded back.
Reset obviously restores the default-settings, ie: those WinSieve had before you messed around with them. :-7
If you click the right margin of WinSieve (the one with the tiny arrow right), the extra-features' "drawer" will pull open:
If you click the "drag'n'drop" button, or (as its name suggests) if you drop files and/or folders onto it, WinSieve will turn into this:
..although likely larger than that, and placed at the top of the screen for your convenience.
The easiest way to go drag'n'dropping is to open windows below and then drag their content to the drop-area above.
However you can place it anywhere and resize it as you wish, and also minimize it or go full-screen.
This is where you can batch-process multiple files and folders: simply drag'n'drop them to the textbox "with the bouncing double-icon", or type-in their full path. Then select the desired function. (Or close the window to go back to WinSieve.)
BATCH TRASH DUPLICATES [LEARN MORE]
It's the same as the right-clicking option, but on a wider scale: here you can find duplicate files across different folders, and/or single files.
BATCH FILE-FRISKING [LEARN MORE]
The same as the right-clicking option, but performed on multiple files and/or all files contained in any folders.
The embeddings will be extracted in a folder called "Extracted", by default on your desktop - but you can browse for another path with the ".." button.
No separate embedding-extraction logs will be generated, but you can still find all the details in the main log.
RE-GROUPING [LEARN MORE]
Copy (or move) a bunch of scattered data to separate folders containing up to N files (and/or folders) each.
The re-grouped data will be stored in a folder called "Grouped", by default on your desktop - but you can browse for another path with the ".." button.
CUSTOM SIEVING [LEARN MORE]
To extract your chosen types of data from anywhere on your PC, and have it organized WinSieve's way.
PS: in order to keep the batch operations steady, very large files won't be processed for embeddings extraction and batch-sieving.
By default the limit is 100MB, but you can reduce that - or increase it.. at your own risk. ;-)
The trashing of duplicate files, and the re-groupings, will be performed regardless of the filesize.
"MAIN DRAWER" OPTIONS
Basically, this extends to other folders the search for duplicates - preventing you from sieving stuff you already have: either anywhere on your computer (eg: images in your "my pictures" folder) or from previous sievings (ie: if you didn't remove the previous NewCache) or from other browsers' cache.
However adding too many exclusions may slow down the overall performance a lot, and if too many files are excluded WinSieve might malfunction.
It's basically a snapshot of the usage information of your PC. It can be created on-demand on the desktop (by clicking the button inside the small "drawer") or automatically at the end of each sieving (it gets stored in WinSieve's program folder, along with the sieving log).
You can separately select what to report, so it's easy to turn it into a handy digest of all your favourites (by deselecting all other options than "favorites"), a list of the documents you've been working on, statistics about the programs you use the most.. or all of them.
Also none of them - but of course when all the options are unchecked, the report will be disabled.
Detailed report also gets you a list of recent DLLs and system programs (no longer sorted by program name, but by access date - with all following runs of the same program listed too). And date & time of access to recent files, and that of the adding of each favorite.
Can't determine date & time of recent programs and history, sorry.
QUICK-ERASING OF THE CACHE (Internet Explorer only)
If you answered YES by mistake you can still restore it, or permanently erase the cache by emptying the recycle bin.
That should be enough - however, to make sure no glitch will ever happen, perform a clean-up from the Internet Options anyway.
When the shell-integration is enabled, upon right-clicking on a folder you'll be offered the option to "Trash duplicates" (or whatever caption you chose in the settings). This allows you to take advantage of WinSieve's powerful search for duplicate files for doing a little "household chores" on your computer.
Since this is not "mere internet cache", however, the quickmode will be disabled - to guarantee the maximum accuracy.
(However you can always cheat by setting the large-file limit to zero. ;-)
If any duplicate is found, you'll be prompted to move them to the Recycle bin - so you can always undo, or conveniently restore a few to their original position.
A logfile will be generated, also, listing all files and their matching.
(eg: c:\myfile.txt = d:\myfolder\anothernameformyfile.log)
Drag the log out of the bin and read it, or sort Recycled files by time of deletion and.. voilà: the logfile acts as separator from previously deleted stuff!
(ie: all removed duplicate files follow it, and whatever precedes it was not trashed by WinSieve)
PS: it's not recommended to examine a whole hard-disk, nor too many files. Use specific hard-disk maintenance programs to do that, instead.
You might want to make a distribution-list of all the email-addresses found in a textfile (or group of files/folders, using the batch processing feature). Or maybe trace back the web links you mentioned in a letter. Or find a certain phonenumber in a bunch of email backups. Or the activation code of a program.
Also: you might have an unknown file (or one labelled with the wrong type) and you wish to restore its proper file-type.
(eg: dragging on the desktop an image off your browser, it got saved as HTML instead of JPG)
Or: a file is marked "system" - and whenever you try to copy or move it, you're asked a boring prompt of confirmation.
(eg: folders' hidden album pictures, which cannot even be un-hidden, or the thumbs.db)
WinSieve can take care of all that. And while it's frisking, it can also..
Some files (Powerpoints, Word documents, thumbs.db, Visual Basic forms and many more) may contain images or sounds, and possibly other embedded stuff.
WinSieve can inspect their contents with its powerful filetype-recognition system, and attempt to extract any embedded data as separate stand-alone files.
Some of them may have glitches (ie: a sound may be truncated, or end with a few jiffies of white noise), however I'm already thinking of workarounds to better determine the proper end of the files. And GIFs and JPGs already get trimmed down to their correct size.
WinSieve is in no way an alternative to WinZIP or WinRAR, nor to the programs that generated those documents. As a matter of fact sometimes those embeddings are compressed, or split in blocks, making them only partially extractable or not readable at all, but.. hey, you can't blame the picklock if it breaks the door, or if it can't open a safe!
Also, the fact that it looks like some type of file doesn't necessarily mean it is that type of file: recognition is based on file headers (up to 16 bytes), so "false positives" may well happen. That is why by default only certain types of files will be extracted - but as always you can change that to suit your needs.
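Just to illustrate what header-based recognition means, here is a small Python sketch that guesses a file type from its first 16 bytes. The signatures listed are the well-known ones for those formats; the table itself and the function names are only an illustration, not WinSieve's own list or code.

```python
# A few well-known file signatures ("magic numbers").
# Illustrative only: WinSieve's real table is not documented here.
SIGNATURES = [
    (b"\xFF\xD8\xFF", "JPG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"%PDF", "PDF"),
    (b"PK\x03\x04", "ZIP"),
]


def guess_type(header: bytes):
    """Return a type name if the first bytes match a known signature."""
    head = header[:16]  # only the header is inspected, never the whole file
    for magic, kind in SIGNATURES:
        if head.startswith(magic):
            return kind
    return None


def guess_file_type(path):
    """Read just the first 16 bytes of a file and guess its type."""
    with open(path, "rb") as f:
        return guess_type(f.read(16))
```

This also shows where "false positives" come from: a few matching bytes say little about the rest of the file. For example, JAR and ODT files are ZIP containers, so they start with the same `PK` signature as a plain ZIP.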
RE-GROUPING FILES AND/OR FOLDERS
You could do this yourself, sure, but WinSieve is faster and just as accurate as you'd be. Here's an example of regrouping by 3:
By default the original files and folders remain where they are, and a re-grouped copy of them is generated.
Yet you can choose to reorganize them permanently by moving them instead of copying them.
There's no undo to this, so be careful what you're doing.
In case of large files, which would take too long (or too much disk space) to be moved to another drive, have them regrouped on the same drive.
If you move C:\folder1\mytext.doc to C:\folder2\folder3\, only the directory link will be updated. But when you do that cross-unit (eg: if you move C:\mytext.doc to D:\mytexts\) the whole file needs to be copied there, and then deleted from the source drive.
PS: when files are fewer than the limit per folder, no folder is generated - as it would be pointless to have a <Grouped><1> folder without a <Grouped><2>.
The examples above are just meant to clarify how the grouping is performed, but actually myPic1.jpg would be grouped as C:\Grouped\myPic1.jpg instead of C:\Grouped\1\myPic1.jpg
I realize this all may sound terribly confusing, however worry not: it's much easier than it looks, and practice will make things clear in a while.
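If it helps, the grouping rule described above can be sketched in a few lines of Python. This only plans the destinations, without copying or moving anything. The function name is hypothetical, the sub-folders are assumed to be numbered 1, 2, 3.. as in the paths mentioned above, and a single group (including one that is exactly full) is assumed to go straight into the Grouped folder.

```python
from pathlib import PureWindowsPath


def plan_regrouping(files, per_folder, target=r"C:\Grouped"):
    """Map each source path to its destination, per_folder items per sub-folder."""
    if per_folder < 1:
        raise ValueError("per_folder must be at least 1")

    root = PureWindowsPath(target)
    # Assumption: when everything fits in one group, no numbered
    # sub-folder is created at all.
    single_group = len(files) <= per_folder

    plan = {}
    for index, source in enumerate(files):
        name = PureWindowsPath(source).name
        if single_group:
            destination = root / name
        else:
            # Items 0..N-1 go to folder 1, N..2N-1 to folder 2, and so on.
            destination = root / str(index // per_folder + 1) / name
        plan[source] = str(destination)
    return plan
```

Regrouping four pictures by 3 would thus put the first three in `C:\Grouped\1\` and the fourth in `C:\Grouped\2\`. Note that this sketch does not handle two source files sharing the same name.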
If a filesize-limit is specified (eg: 4000 bytes), larger files won't be considered thumbnails.
This is useful, for example, to prevent a picture called "Smallville" from being recognized as thumbnail just because it begins with "small".
The standard measure for a banner used to be 468x60, that is: almost 8 times wider than tall. However today most sites only allow smaller room for banners, and the width/height proportion may go as low as 3 - which in fact is WinSieve's default value.
Also, vertical banners were introduced - which is why WinSieve not only checks the X/Y proportions, but also the Y/X.
The minimum height is used to tell the difference between a banner and a large button: if you use lower values, probably some graphic elements will be recognized as banners. And vice versa: if you use higher values, some banners will be recognized as graphic elements.
(eg: to recognize large buttons as banners, set minimum height to 0.)
The maximum height is used to decide whether the graphic object lies within the possible measures of a banner (by default, up to 180 pixels tall) or is another type of image - for example, a panoramic photo (much wider than tall).
(eg: to consider panoramic photos and page headers as banners, set maximum height to 0.)
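As a rough sketch of the banner test described above, in Python: the names are hypothetical, and for vertical banners the height limits are assumed to apply to the shorter side. The default minimum height isn't stated here, so that check is simply left off (0) in this example.

```python
def looks_like_banner(width, height, min_ratio=3.0, min_height=0, max_height=180):
    """Hypothetical banner test: proportions first, then the height limits.

    A limit of 0 disables the corresponding check, as in the examples above.
    """
    if width <= 0 or height <= 0:
        return False  # corrupted image or unknown size

    # Checking long/short covers both X/Y (horizontal banners)
    # and Y/X (vertical banners) in one go.
    long_side, short_side = max(width, height), min(width, height)
    if long_side / short_side < min_ratio:
        return False

    # Too thin: probably a large button or another graphic element.
    if min_height and short_side < min_height:
        return False

    # Too thick: probably a panoramic photo or a page header.
    if max_height and short_side > max_height:
        return False
    return True
```

With these defaults a classic 468x60 strip and a 120x600 vertical banner both pass, while a 2000x400 panoramic photo passes only once the maximum height is set to 0.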
Smaller images are probably of no interest (eg: large buttons, or .GIF animations), so they're sieved separately.
They get recognized by the global amount of pixels, like with digital cameras' MegaPixels, so entering 320x240 or 160x480 produces the same comparison value of 76.8 kiloPixels.
As usual, the "only JPG" option will prevent images of other graphic formats than JPG from being recognized as small pictures.
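The pixel-count comparison is easy to check yourself. In this little Python sketch the function names are made up, "smaller" is assumed to mean strictly fewer pixels than the limit, and the "only JPG" option is assumed to accept both the JPG and JPEG extensions.

```python
def pixel_count(size: str) -> int:
    """Turn a 'WIDTHxHEIGHT' string into its total amount of pixels."""
    width, height = (int(part) for part in size.lower().split("x"))
    return width * height


def is_small_picture(width, height, limit, ext, only_jpg=True):
    """Hypothetical check: fewer pixels than the limit, whatever the shape."""
    if only_jpg and ext.upper() not in {"JPG", "JPEG"}:
        return False
    return width * height < pixel_count(limit)
```

Both `pixel_count("320x240")` and `pixel_count("160x480")` give 76800, which is why the two settings behave the same way.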
Non-photoquality images are likely to be huge backgrounds, so non-JPG full-size pics are normally considered graphic elements.
Then there are the small files: below 70 bytes, either it's crappy text or a block of solid color used as background, or maybe a tiny monochromatic icon.
If you wanna keep them, however, just set the limit to 0 bytes and none will be skipped. To ignore most graphic elements and thumbnails, instead, increase it to 1500 or more.
Likewise, you'll be glad to be spared of plenty of SWF junk (banners or empty players).. But still, with WinSieve, the final choice is always up to you.
ie: set the limit to 0 to keep them all.
Files are first sorted by their size, and then the exact same-size files are compared by three sample-strips of 1kB (ie: 1kB at the beginning, 1kB from the end, and the 1kB in the middle of the file). No matter how large a file is, statistically it's virtually impossible that two files of the same size match in three points - yet having to read just 3kB per file makes the matching amazingly faster.
Only in the event that all three sequences are identical (or the files are smaller than 1kB), a byte by byte comparison will be started - until a difference is found. And if none is found, they're obviously the same file beyond any doubt.
From v5.1, byte by byte comparison is only performed on files smaller than 2MB. Larger files are checked further with 3 random sequences.
(ie: if the top-middle-bottom sequences match, three more sequences - this time taken at random points of the files - are compared, instead of the whole file.)
If you can tolerate a tiny margin of doubt, DO enable the quickmode: byte by byte comparison will only be performed on files smaller than 1kB, and other files will be considered equal as long as the respective three sequences match.
I also recommend setting the large-file limit, which will force the quickmode on big files. This way you can enjoy the maximum accuracy of the non-quickmode, and yet never get stuck on huge files.
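For the curious, the comparison strategy described above could look roughly like this in Python. It's a sketch under assumptions, not WinSieve's actual code: all names are hypothetical, the middle strip is assumed to be centered, and the large-file limit is left out for brevity.

```python
import os
import random

STRIP = 1024                         # 1kB sample strip
FULL_COMPARE_LIMIT = 2 * 1024 ** 2   # byte by byte only below 2MB


def same_bytes(path_a, path_b, chunk=64 * 1024):
    """Byte by byte comparison, stopping at the first difference."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a, block_b = a.read(chunk), b.read(chunk)
            if block_a != block_b:
                return False
            if not block_a:          # both files ended without a difference
                return True


def strips_match(path_a, path_b, offsets):
    """Compare one 1kB strip of both files at each of the given offsets."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        for offset in offsets:
            a.seek(offset)
            b.seek(offset)
            if a.read(STRIP) != b.read(STRIP):
                return False
    return True


def probably_identical(path_a, path_b, quickmode=False, rng=random):
    size = os.path.getsize(path_a)
    if size != os.path.getsize(path_b):
        return False                 # different sizes: never duplicates
    if size < STRIP:
        return same_bytes(path_a, path_b)

    last = size - STRIP              # offset of the final strip
    if not strips_match(path_a, path_b, [0, last // 2, last]):
        return False
    if quickmode:
        return True                  # accept the tiny margin of doubt
    if size < FULL_COMPARE_LIMIT:
        return same_bytes(path_a, path_b)
    # Larger files: three more strips at random points instead of a full read.
    return strips_match(path_a, path_b, [rng.randint(0, last) for _ in range(3)])
```

In a real run the files would first be sorted by size, so that only same-size candidates ever reach `probably_identical`. Two 5000-byte files differing by a single byte outside the three strips would be told apart in normal mode, but considered equal in quickmode: that's the "tiny margin of doubt".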
It's the textfile stored in NewCache where WinSieve notes down all the details of all the operations performed.
If you want to know why a file was skipped, or whose duplicate it was, or its image size, or its file header*, or its original URL**, or from which fragments an embedding was rebuilt, or any other piece of information.. that's the place to look.
*if it could not be recognized **not available for Internet Explorer
It also tells the time when the sieving was started and completed, the paths and parameters used, and provides some basic performance statistics as well.
Here is how each output data folder is structured:
or: ..with each sub-folder structured as an independent NewCache.

| Folder contents | File types, name patterns or sizes |
|---|---|
| Full-sized images | .JPG, .JPEG, .JPE, .JIF, .JFIF, .GIF, .PNG, .BMP, .PCX, .WMF, .TIF, .TIFF, .XBM, .PIC, .AI, .BW, .CLP, .CT, .CUT, .DIB, .EMF, .EPS, .FPX, .GEM, .IFF, .IMG, .JSL, .LBM, .MAC, .MSP, .PBM, .PCT, .PFR, .PGM, .PPM, .PS, .PSD, .PSP, .RAS, .RAW, .RGB, .RGBA, .RLE, .SCT, .SGI, .TUB, .WPG, .TGA |
| Graphic elements (buttons, bullets, icons, etc.) | |
| Thumbnails (picture miniatures) | MINI*, S_*, SM*, SMALL-*, TH_*, THUMBPHOTO*, THUMS*, TM-*, TMB_*, TN*, TN2_*, TNTHUMB*, *_SM, *_SMALL, *_T, *_TH, *_THN, *_TH*, *_TN*, *THUMB* |
| Banners (advertising strips of graphic) | |
| Smaller images and/or big graphic elements | |
| Corrupted images, or images of unknown size | |
| Junk images (images of predefined size) | 108x87, 115x115, 150x150, 35x*, 55x*, 60x*, 65x*, 75x*, 90x*, 100x*, 110x*, 180x*, *x40, *x45, *x90, *x100, *x110, *x120 |
| HTML pages | .HTM, .HTML, .SHTML, .XHTML, .XTML, .XML, .HTC, .HMT, .MHTML, .ASP, .CFM, .PHTML, .HTT, .HTA |
| Java applets, JavaScripts and CGI scripts | .CCS, .CSS, .JS, .CLA, .CLASS |
| Fonts | .TTF, .OTF, .FON |
| Icons and cursors | .ICO, .CUR, .ANI, .ICL |
| Multimedia files (audio, video, MIDI..) | .AVI, .DVX, .DIVX, .MPG, .MPEG, .M4V, .MPE, .MPEGA, .MOV, .WMV, .RM, .SWF, .FLV, .MID, .MP3, .WAV, .AU, .RA, .RAM, .KAR, .SPL, .SID |
| Programs and archives | .ZIP, .RAR, .LHA, .CAB, .JAR, .TAR, .LZH, .TGZ, .SIT, .EXE, .COM, .DLL, .DRV, .INS, .THEME, .VBS, .PP, .PAS, .C, .VBP, .VBW, .BAS, .FRM, .FRX |
| Text files and other documents | .TXT, .RTF, .ODT, .ODP, .ODB, .ODS, .ODG, .DOC, .PPS, .PPT, .XLS, .MDB, .PDF, .LOG, .INI, .NFO, .DIZ, .HLP, .CHM, .CSV, .PUB |
| All that didn't match the sieving criteria | (WON'T BE CREATED IF THE TRASHBIN IS DISABLED) |
| Files that could not be recognized | (WON'T BE CREATED IF "KEEP FILES OF UNKNOWN FORMAT" IS DISABLED) |
PS: to enable the quick-erasing of the cache: click "settings", tab "folders", then for each browser select "trash" to move it to the Recycle bin (so you can restore it in case of need), "wipe" to permanently delete it (be careful: there's no un-do to this), "ask" to be prompted every time (first you're asked if you wish to wipe it permanently; answer no and you'll be asked if you wish to trash it), or select "keep" to leave it as it is.
Click here to read the complete history of the previous versions.
While you surf the internet, your browser writes into the so-called internet cache every single page that you visit: text, images, multimedia files, the programs you downloaded.. In this way, the next time you open that page, the computer won't have to reload it all over again: just the elements that have changed since your last visit. (eg: if a picture was added or modified.)
This makes browsing much quicker, and also allows you to view the pages off-line, but it also means a potential risk for your privacy: in fact, anybody could see not only which sites you visited (by peeking inside the history), but also examine the exact pages you viewed. This can get pretty embarrassing with erotic sites, and turns out to be downright dangerous if you visited the site of your bank or read your email on a website. In a few words, your privacy is at risk with every page containing your bank accounts, credit-card IDs, phone-numbers / address, etc.
Not to mention all the stuff that is being downloaded without you knowing that (such as the hidden pop-ups), or without you asking for it (eg: advertising banners of erotic sites, even full porn images)..
The only way round this nuisance is to erase the history and the cache every time before you close the browser - and make sure that you haven't added (maybe without even realizing that you were doing so) some undesired bookmark to your "favourites". In this way, you'll remove every trace (well, at least those that can be removed without reformatting the PC and/or being a hacker) - but you'll also lose the chance to view offline the things you had downloaded (unless you've already saved them elsewhere).
Not to mention the fact that many things cannot be saved: some sites (bastards!) disable the right-click (so you can no longer "Save image as.." etc.), and it's not always possible to save the whole page (File|Save as|Complete webpage). And sometimes the browser "forgets" to copy some mandatory components (eg: some JAVA classes), and as a result the page you've saved does not work.
Sometimes (due to the imprecision of an automated server-side procedure, or with the precise purpose of cheating expert users who might want to check out the temporary internet files) it also occurs that some contents are masked: they're unrecognizable because they're labelled with a different filetype than their proper one (eg: images saved as HTML or text pages, or vice versa).
Well then: WinSieve is the definitive answer to these (and many more :-) nuisances!
HOW TO REMOVE ALL TRACES FROM YOUR COMPUTER
Once you've gathered everything you want to keep and secured it elsewhere, it's a good rule for preserving your privacy to permanently remove the temporary internet files and clear the browsing history - so that nobody else will be able to access off-line the pages and the URLs of the sites you've visited. (further details here)
Each browser has its own way to do that:
(The screenshots are in Italian, however they're plain to understand by comparison with those your browser will show you.)
Click "settings" and go to the folders tab, then click the icon of the browser you wish to install: its homepage will pop-up, allowing you to download the installer.
Try out your new browser, and it'll be automatically recognized and configured the next time you start WinSieve.
If it doesn't, you can always try browsing with "..".
WinSieve is freeware: you're free to use it without paying me - however donations are highly appreciated.
You are also free to distribute it, as long as the following conditions are respected:
For any other use, get in touch with me. All rights reserved.
(That is to say, you're not allowed to steal the idea. But if you wish to port WinSieve to Linux/Mac or other platforms, we can talk about it.)
You're also welcome to inform me of program-bugs, or if you have any idea about how to improve it.
PS In order to reward free software supporters, I had to implement a (puny) nag-system for non-donors. Now, after having fetched 700MB of sieved internet cache and/or having tried out the special functions for more than 5 times each, thus proving you enjoy using my program on a rather regular basis, you'll be reminded (although never obliged) to support my work. Fair is fair, after all - and the basic sieving functions always remain available for everyone.
LEGAL DISCLAIMER WinSieve comes "as is", with no implicit or explicit guarantee whatsoever. You agree to use it at your own risk. (If any.) Under no circumstances will I be held responsible for the use that you make of it. Some sites feature copyrighted material, and to copy or use it without a written authorization of the owner may be a violation of international laws. Also, WinSieve is in no way meant to spy on people: you're supposed to only extract the contents of your own Internet cache, or you might be prosecuted on the charge of stalking.