Copy whole websites or sections locally for offline browsing

Cyotek WebCopy


  -  4.04 MB  -  Freeware
  • Latest Version

    Cyotek WebCopy 1.9.1 Build 872 LATEST

  • Review by

    Daniel Leblanc

  • Operating System

    Windows Vista / Windows 7 / Windows 8 / Windows 10

  • Author

    Cyotek Ltd.

  • Filename

    setup-cyowcopy-1.9.1.872-x86.exe

  • MD5 Checksum

    a140259ec36b24ed0e514e9c23f08ded

Cyotek WebCopy is a free tool for copying full or partial websites locally onto your hard disk for offline viewing. It will scan the specified website and download its content onto your hard disk. Links to resources such as style sheets, images, and other pages on the website will automatically be remapped to match the local path. Using its extensive configuration options, you can define which parts of a website will be copied and how. This software may be used free of charge, but as with all free software, there are costs involved to develop and maintain it.
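
To make the idea of remapping concrete, here is a minimal Python sketch, not WebCopy's implementation: it fetches a single page, saves the resources it references, and rewrites those references so the saved page points at the local copies. The site URL and output folder are placeholders.

```python
# Minimal illustration of "download and remap" (not WebCopy itself).
# Fetches one page, saves referenced resources locally, and rewrites
# their URLs in the saved HTML so the page can be viewed offline.
import os
import re
import urllib.parse
import urllib.request

BASE_URL = "https://example.com/"    # placeholder site
OUT_DIR = "mirror"                   # placeholder output folder

os.makedirs(OUT_DIR, exist_ok=True)
html = urllib.request.urlopen(BASE_URL).read().decode("utf-8", "replace")

def save_locally(match):
    """Download one referenced resource and return the remapped attribute."""
    attr, url = match.group(1), match.group(2)
    absolute = urllib.parse.urljoin(BASE_URL, url)
    local_name = os.path.basename(urllib.parse.urlparse(absolute).path) or "index.html"
    try:
        urllib.request.urlretrieve(absolute, os.path.join(OUT_DIR, local_name))
    except (OSError, ValueError):
        return match.group(0)        # keep the original reference on failure
    return f'{attr}="{local_name}"'  # remap to the local copy

# Rewrite src="..." and href="..." attributes found in the mark-up.
remapped = re.sub(r'(src|href)="([^"]+)"', save_locally, html)

with open(os.path.join(OUT_DIR, "index.html"), "w", encoding="utf-8") as f:
    f.write(remapped)
```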

What can WebCopy do?

WebCopy examines the HTML mark-up of a website and attempts to discover all linked resources such as other pages, images, videos, file downloads - anything and everything. It downloads all of these resources and continues to search for more. In this manner, WebCopy can "crawl" an entire website and download everything it sees in an effort to create a reasonable facsimile of the source website.
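
Conceptually, that crawl is a breadth-first traversal: fetch a page, collect every link in its mark-up, queue anything on the same host that has not been seen yet, and repeat. A rough Python sketch of the loop, again only an illustration with a placeholder starting URL:

```python
# Rough sketch of a breadth-first website crawl (illustrative only).
from collections import deque
from html.parser import HTMLParser
import urllib.parse
import urllib.request

START = "https://example.com/"          # placeholder starting page
HOST = urllib.parse.urlparse(START).netloc

class LinkCollector(HTMLParser):
    """Collects href/src attribute values from the HTML mark-up."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)

seen, queue = {START}, deque([START])
while queue:
    url = queue.popleft()
    try:
        html = urllib.request.urlopen(url).read().decode("utf-8", "replace")
    except OSError:
        continue                         # skip unreachable resources
    parser = LinkCollector()
    parser.feed(html)
    for link in parser.links:
        absolute = urllib.parse.urljoin(url, link)
        # Stay on the same host and avoid revisiting pages.
        if urllib.parse.urlparse(absolute).netloc == HOST and absolute not in seen:
            seen.add(absolute)
            queue.append(absolute)

print(f"Discovered {len(seen)} URLs")
```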

What can WebCopy not do?

It does not include a virtual DOM or any form of JavaScript parsing. If a website makes heavy use of JavaScript to operate, it is unlikely that WebCopy will be able to make a true copy, because it cannot discover all of the website's pages when JavaScript is used to dynamically generate links.

It does not download the raw source code of a website; it can only download what the HTTP server returns. While it will do its best to create an offline copy of a website, advanced data-driven websites may not work as expected once they have been copied.
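
The snippet below illustrates the JavaScript limitation with a hypothetical page: a static HTML parser sees the link written directly into the mark-up, but never the one a browser would create by running the script.

```python
# A static parser sees only links present in the HTML the server returned;
# links created at runtime by JavaScript are invisible to it.
from html.parser import HTMLParser

PAGE = """
<a href="/static-page.html">Static link</a>
<script>
  // A browser would add this link at runtime; a crawler never sees it.
  document.body.innerHTML += '<a href="/generated-page.html">Generated link</a>';
</script>
"""

class Hrefs(HTMLParser):
    def __init__(self):
        super().__init__()
        self.found = []
    def handle_starttag(self, tag, attrs):
        self.found += [value for name, value in attrs if name == "href"]

parser = Hrefs()
parser.feed(PAGE)
print(parser.found)   # ['/static-page.html'] -- the generated link is missing
```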

Features and Highlights

Rules
Rules control the scan behavior, for example excluding a section of the website. Additional options are also available, such as downloading a URL to include in the copy but not crawling it.
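
As a conceptual illustration of what an exclusion rule boils down to (this is not WebCopy's rule syntax), a crawler can test each discovered URL against a pattern before deciding to copy it; the /forum/ pattern below is purely hypothetical.

```python
# Conceptual illustration of an exclusion rule (not WebCopy's rule syntax):
# skip any URL under a hypothetical /forum/ section of the site.
import re

EXCLUDE = re.compile(r"/forum/")        # hypothetical rule pattern

def should_crawl(url: str) -> bool:
    """Return False for URLs the rule excludes from the copy."""
    return EXCLUDE.search(url) is None

for url in ("https://example.com/about.html",
            "https://example.com/forum/topic/42"):
    print(url, "->", "copy" if should_crawl(url) else "skip")
```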

Forms and Passwords
Before analyzing a website, you can optionally post one or more forms, for example to log in to an administration area. HTTP 401 challenge authentication is also supported, so if your website contains protected areas, you can either pre-define user names and passwords or be automatically prompted for credentials while scanning.
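
The sketch below shows the same two ideas in Python using the third-party requests library, with placeholder URLs and credentials: post a login form first so later requests carry the session cookie, then answer an HTTP 401 Basic challenge with pre-defined credentials. It illustrates the workflow, not WebCopy's interface.

```python
# Sketch of the two authentication ideas described above, using the
# third-party "requests" library; URLs and credentials are placeholders.
import requests

session = requests.Session()

# 1) Post a login form before crawling, so later requests carry the
#    authenticated session cookie.
session.post("https://example.com/admin/login",
             data={"username": "demo", "password": "secret"})

# 2) Answer an HTTP 401 (Basic) challenge with pre-defined credentials.
protected = session.get("https://example.com/protected/page.html",
                        auth=("demo", "secret"))
print(protected.status_code)
```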

Viewing links
After you have analyzed your website, the Link Map Viewer lets you view all the links found in your website, both internal and external. Filtering makes it easy to focus on particular types of links.

Configurable
There are many settings for configuring how your website will be crawled. In addition to the rules and forms mentioned above, you can also configure domain aliases, user agent strings, default documents, and more.
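
Two of those options are easy to picture in code. The sketch below, with hypothetical values throughout, sends a custom user agent string with a request and treats a domain alias as part of the same site when deciding whether a URL is internal.

```python
# Rough sketch of two of these options (placeholder values throughout):
# sending a custom user agent string and treating a domain alias as the
# same site when deciding whether a URL is "internal".
import urllib.parse
import urllib.request

USER_AGENT = "MyOfflineCopy/1.0"                 # hypothetical UA string
ALIASES = {"example.com", "www.example.com"}     # hosts treated as one site

request = urllib.request.Request("https://example.com/",
                                 headers={"User-Agent": USER_AGENT})
html = urllib.request.urlopen(request).read()

def is_internal(url: str) -> bool:
    """A URL counts as part of the site if its host is any known alias."""
    return urllib.parse.urlparse(url).netloc in ALIASES

print(is_internal("https://www.example.com/page.html"))  # True
```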

Reports
After scanning a website, you can view lists of pages, errors, missing pages, media resources, and more.

Regular Expressions
Several configuration options make use of regular expressions. The built-in editor allows you to easily test expressions.
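
The same kind of expression can also be tried outside the editor; for instance, this small Python test uses an example pattern matching common image extensions.

```python
# Testing a crawl-related expression outside the built-in editor
# (the pattern and sample URLs are only examples).
import re

pattern = re.compile(r"\.(?:jpe?g|png|gif)$", re.IGNORECASE)

for url in ("/images/banner.PNG", "/downloads/setup.exe"):
    print(url, "matches" if pattern.search(url) else "does not match")
```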

Website Diagram
View and customize a visual diagram of your website, which can also be exported to an image.

Note: Requires .NET Framework.


What's new in this version:

Added:
- Added the ability to read cookies from an external file
- Test URL dialogue now allows configuring cookies
- Added cookie, cookie-jar and discard-session-cookies command line parameters (User Manual)
- Added support for the legacy compress content encoding

Changed:
- Documentation improvements
- Test URL dialogue now uses load on demand for settings pages
- 401 challenges no longer display credential dialogues unless the authentication type is either Basic or Digest, as no other values have been tested due to lack of resources
- Updated mime database

Fixed:
- Posting a form did not set an appropriate content type
- Custom headers were not applied when posting forms
- If a URL was previously skipped but then included in future scans, the original skip reason could be retained
- A blank error message was displayed for Brotli decompression errors
- One-time project validation checks were ignoring the content encoding settings of the project (which by default is Gzip and Deflate) and were requesting content with Brotli compression
- Brotli decompression could fail with streams larger than 65535 bytes
- The URI transformation service incorrectly attempted to add prefixes to email addresses; this in turn caused a crash if the mailto: reference was malformed
- A crash could occur if a content type header was malformed and was either utf or utf-
- Fixed an issue where command line arguments sometimes didn't correctly process ambiguous relative arguments that could be either a file name or an unqualified URI
- Fixed a crash that could occur when switching between empty virtual list views during a crawl and items were then subsequently added
- A crash which could occur when loading localised text is no longer fatal
- Speed and estimated download time calculations were incorrect and could cause a crash when downloading large files
- A crash would occur when editing a file that didn't have a mime type
- Speculative fix for a crash that could occur when finishing the New Project Wizard
- Fixed a crash that occurred if a 401 challenge was received and the www-authenticate header was a bare type
- If a website returns a non-standard Content-Encoding value (or one currently not supported by WebCopy), no attempt will be made to decompress the file and it will be downloaded as-is. A new setting has been added to disable this behaviour, but is currently not exposed
- Crashes that occurred when applying project validation corrections (for example, if the base URL redirects, WebCopy will prompt to use the redirected version) are no longer fatal
- Trying to save a CSV export with a relative filename crashed
- The quick scan diagram view could crash if invalid host names were detected
- The "Limit distance from base URL" setting now only applies to URLs that have a content type of text/html, e.g. it will prevent deep scanning whilst still allowing retrieval of all linked resources
- URLs that had exclusion rules would still get requested depending on the combination of project settings
- The CLI would crash if the recursive and output parameters were defined, and the specified output directory did not exist
- Client is no longer marked as dpi-aware, which should resolve pretty much all the problems with the application not displaying correctly on high DPI screens. This is an interim fix until dpi-awareness can be properly introduced.
- Fixed a crash that could occur when trying to query whether the "scan above root" setting should be enabled and the project contained an invalid URI
- Fixed a crash that could occur when the scan/download progress dialog was closed
- The Export CSV dialog wasn't localised correctly, resulting in seemingly two Cancel buttons

Removed:
- The PDF meta data provider has been removed
