Beam Us Up Crawler Updated to v1.2.1 – Big Feature Update

Download: Windows, Mac, or Linux

Remember, you need Java installed for Mac (see the how to run on Mac guide) and Linux; you can check it's installed by running "java -version" in a terminal. You should also uninstall any previous version first.

Need Support?

Leave a comment on this post with your real email address and I’ll get back to you.

What’s in the update?

So we’ve been busy working on updates to Beam Us Up Crawler, and the updates are many, many, many!

Crawls are automatically saved

Crawl data is saved periodically while a crawl runs.

If you crawl too many URLs and the program freezes, restarting it will load the crawl from where you left off, and you can then complete it or delete it.
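
Under the hood, a resumable crawl comes down to periodic state persistence. Here is a rough sketch of the general pattern in Java (illustrative only; the class and file names are made up, and this is not Beam Us Up’s actual code):

import java.io.*;
import java.util.*;
import java.util.concurrent.*;

// Illustrative sketch: checkpoint the visited set and the frontier queue
// every 30 seconds so that a restart can resume a half-finished crawl.
// Assumes the caller passes in thread-safe collections.
public class CrawlCheckpoint {
    private final File file = new File("crawl-state.ser"); // made-up file name

    public void start(Set<String> visited, Queue<String> frontier) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleAtFixedRate(() -> save(visited, frontier), 30, 30, TimeUnit.SECONDS);
    }

    private synchronized void save(Set<String> visited, Queue<String> frontier) {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(new ArrayList<>(visited));   // copy for a stable snapshot
            out.writeObject(new ArrayList<>(frontier));
        } catch (IOException e) {
            e.printStackTrace(); // a real app would surface this in the UI
        }
    }

    @SuppressWarnings("unchecked")
    public synchronized List<List<String>> load() throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return List.of((List<String>) in.readObject(), (List<String>) in.readObject());
        }
    }
}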

Right-Click Copy

Lots of options for copying data out of the tables.

Improvements in error filters

Color-coded filters and cleaned-up buttons.

Inside and Outside of Sitemaps

See whether the URLs in your crawl are inside or outside of your sitemaps.
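
Conceptually this is a set comparison between the URLs you crawled and the URLs listed in the site’s sitemap.xml. A minimal sketch of the idea in Java (illustrative only, with example.com as a placeholder; this is not the app’s actual code, and a real tool would use an XML parser and handle sitemap index files):

import java.net.URI;
import java.net.http.*;
import java.util.*;
import java.util.regex.*;

// Illustrative: fetch sitemap.xml, collect its <loc> URLs, then flag each
// crawled URL as inside or outside the sitemap.
public class SitemapCheck {
    public static void main(String[] args) throws Exception {
        String site = "https://example.com"; // placeholder domain
        HttpClient client = HttpClient.newHttpClient();
        String xml = client.send(
                HttpRequest.newBuilder(URI.create(site + "/sitemap.xml")).build(),
                HttpResponse.BodyHandlers.ofString()).body();

        // Crude <loc> extraction, good enough for a sketch.
        Matcher m = Pattern.compile("<loc>(.*?)</loc>").matcher(xml);
        Set<String> inSitemap = new HashSet<>();
        while (m.find()) inSitemap.add(m.group(1).trim());

        List<String> crawled = List.of(site + "/", site + "/orphan-page"); // stand-in data
        for (String u : crawled) {
            System.out.println((inSitemap.contains(u) ? "IN SITEMAP     " : "NOT IN SITEMAP ") + u);
        }
    }
}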

Easier to see inlinks to and outlinks from a page

URL Search

Dark Mode

Bug Fixes and Minor Improvements

A lot of bug fixes and minor improvements!

16 Comments

  • I tried to open the app on Mac with no success. I have the latest Java installed, but there is an error that doesn’t allow the app to open.

      • My guess is that the error is simply that it won’t open. This is because of Mac’s overly aggressive “security” and Apple wanting developers to pay them money. We have applied to Apple and are currently trying to sign the DMG file so that it opens easily. However, it is a bit of a pain, especially as we are not Mac developers ourselves.

  • Hello,

    This is a nice upgrade!

    I’m trying to crawl a big website (still in development), and I’m getting two errors.
    The first one:
    Exception in thread "AWT-EventQueue-0" java.lang.ArrayIndexOutOfBoundsException: Index 800 out of bounds for length 800
    at java.desktop/javax.swing.DefaultRowSorter.setModelToViewFromViewToModel(DefaultRowSorter.java:745)
    …………..

    The second one:
    ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console…
    Finished (killed)

    Regards

    • Hi Adrian,

      Can you tell me which operating system and website you are using?

      (If you reply with the website here, don’t worry: I’ll remove it from your comment before publishing, so it’ll remain private ;))

      Thanks.

      • Hello, I’m using it on Debian 12 and the website is running locally with Docker.
        The error comes because I got 144,000 links and didn’t have enough memory (16GB)… hehe. So I think it was an error related to running out of free memory (it could not load it).

        It would be good to have checkbox options for the “extra” filters (because, as I noticed, they run over all the data, whether or not it’s relevant). So if I don’t want to check “URLs Not in Sitemap” on a given run, the crawl would be faster, because there would be no need to compute it.

        Also, are you saving where each URL is linked from? (As before, a checkbox option for this would be a good improvement.) Sometimes it’s good to know which links point to “website.com/this-path”.

        Thanks for your effort!

        • “Are you saving where each URL is linked from?” Yes, but the crawl needs to finish, or you need to stop it, and then it will calculate this.

          144k URLs is quite a lot. I doubt there are many unique types of pages, so I recommend you stop the crawl yourself before it runs out of memory.
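
          By the way, if you are starting the crawler’s jar from a terminal yourself, you can also give Java a bigger heap with the standard -Xmx flag, for example:

          java -Xmx8g -jar beamusup.jar

          (The jar name here is just an example; use whichever file your download ships with.)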

  • Hi Gui / William!

    I really appreciate all your time and effort building this!
    But… I’m running into a small problem retrieving content from our webshop. The problem seems to have something to do with DNS. We use LightspeedHQ as our webshop platform and point the domain name to their servers, so from that point on I’m unable to diagnose further. When I enter our URL, I only receive an HTTP 301 (Moved Permanently) message. I also found a CNAME in our DNS records, “7221.shops.oururl”, but this gets re-routed. I hope you have some answers for me, because I’m very curious 🙂

    Thanks in advance!
    Hielke
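
For what it’s worth, a generic way to see where a 301 points (a sketch only, with example.com as a placeholder; this says nothing specific about LightspeedHQ’s setup) is to request the URL without following redirects and print the Location header:

import java.net.URI;
import java.net.http.*;

// Illustrative: disable redirect-following so the 301 itself is visible,
// then print the Location header it points to.
public class RedirectCheck {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NEVER).build();
        HttpResponse<Void> r = client.send(
                HttpRequest.newBuilder(URI.create("https://example.com")).build(),
                HttpResponse.BodyHandlers.discarding());
        System.out.println(r.statusCode() + " -> "
                + r.headers().firstValue("location").orElse("(no Location header)"));
    }
}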

  • Hello, thanks for the app!

    Some feature requests:
    1) A checkbox in Configuration: do not crawl external domains
    2) It would also be good to have info about missing Alt and Title attributes for images
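
Until something like point 2 ships, a quick self-serve check for missing Alt and Title attributes is easy to sketch with the jsoup HTML parser (an assumption for illustration only; the crawler itself may parse HTML differently, and example.com is a placeholder):

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Illustrative: list every image on a page that is missing an alt or title
// attribute. Requires the jsoup library on the classpath.
public class MissingAltCheck {
    public static void main(String[] args) throws Exception {
        Document doc = Jsoup.connect("https://example.com").get(); // placeholder URL
        for (Element img : doc.select("img")) {
            boolean noAlt = !img.hasAttr("alt") || img.attr("alt").isBlank();
            boolean noTitle = !img.hasAttr("title");
            if (noAlt || noTitle) {
                System.out.println(img.attr("abs:src")
                        + (noAlt ? "  [missing alt]" : "")
                        + (noTitle ? "  [missing title]" : ""));
            }
        }
    }
}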