We are running into an issue with the initial download of our large project database. There are multiple species associated with this project, but we only want to extract detections of one of them, and I’m wondering whether there’s been any progress on developing a pre-download filter (by species, time period, etc.) to make the download more manageable. I saw this post from February where someone had a similar issue, and @jsrs at the time mentioned it was not possible. Another complicating factor is that there is a CTT node network associated with the project that is presumably generating tons of hits and vastly increasing the database size, but we do not need those detections - only hits at standalone stations. I imagine other people are increasingly running into issues with the initial download as node networks are established and study periods lengthen.
As a secondary question, is there a way to determine the size of the .motus database in advance? (We’re not sure how large our final file will be, since the maximum batch number shown during download is “10000”, which I assume just reflects the download being split into batches.)
We got ~26 hours / 22GB into the download and then had a server timeout error, so any advice, workarounds, etc. would be most welcome! Thanks in advance :)
Hi Claire, just a short response at the moment with respect to node data. I’m assuming you’re using the tagme function from the motus package, which has an argument skipNodes to skip checking for and downloading node data; in my experience this can considerably speed up downloads and reduce file size. If you’ve not used it yet, your local .motus SQLite database likely already has a lot of node data in it, but you can at least prevent downloading more.
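A minimal sketch of what that call might look like (the project number 123 is a placeholder for your own project ID, and the other argument names are as I recall them from the tagme reference - do double-check there):

```r
library(motus)

# Update an existing project database, skipping the node-data tables.
# projRecv = 123 is a placeholder -- substitute your own project ID.
sql_motus <- tagme(projRecv = 123, new = FALSE, update = TRUE,
                   skipNodes = TRUE)
```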
Unfortunately, as Josh said, there is no way to select which data to download from a given project or receiver. That said, it’s something we have plans to introduce in the future.
In your case, I would just run tagme again each time there is a server timeout. This doesn’t start the download from scratch; rather, it picks up from the last batch that was downloaded, so progress is cumulative.
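Since a re-run resumes at the last completed batch, you could even wrap the call in a simple retry loop for very long downloads. This is only a sketch - the project ID, wait time, and retry logic are all placeholders to adapt:

```r
library(motus)

# Re-running tagme() resumes from the last downloaded batch, so a retry
# loop can ride out intermittent server timeouts on very large projects.
# projRecv = 123 and the 60-second pause are placeholders.
repeat {
  res <- try(tagme(projRecv = 123, new = FALSE, update = TRUE,
                   skipNodes = TRUE),
             silent = TRUE)
  if (!inherits(res, "try-error")) break  # finished cleanly
  message("Download interrupted; resuming from last batch...")
  Sys.sleep(60)  # brief pause before retrying
}
```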
Take a look at the tagme function reference - you will see that you can skip a couple of things, such as activity and node data (e.g. skipNodes = TRUE).
For your second question, you can use the function tellme to determine how large the dataset is. See the function reference for more details.
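Something like this (again, the project ID is a placeholder, and the tellme reference documents the exact counts it returns):

```r
library(motus)

# Report how much data a download would involve, without downloading it.
# new = TRUE asks about a fresh database; for an existing .motus file,
# new = FALSE counts only the pending updates.
tellme(projRecv = 123, new = TRUE)
```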
In addition to Adam’s and Lucas’s suggestions, another option you have in the short term is splitting your tags into different projects. That adds a little more management overhead for you, but it will at least allow you to download entire projects containing a smaller number of tags. You could divide them up by species, or by year, or however you want. Just use the template here to ensure we have the info we need to move the tags.
Wow, thank you all so much for the tips! It’s been a while since I downloaded a new database, and I clearly should have read through the help files more carefully. I was unaware of the skipNodes argument, which should be a huge help going forward! @jsrs - great to know that splitting tags into different projects is an option. I just joined this research team but will talk to my PI to see if they want to go that route. Thanks again!
As a follow-up, I just tried the tellme() function but first ran into a server timeout error and then, after extending srvTimeout(), got the following: “Error: Server returned error ‘Failed to count update records. Please contact Motus support.’” Maybe this is just a “try again later” problem?
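For reference, roughly what I ran (project ID is a placeholder, and I’m assuming from the help page that the timeout value is in seconds):

```r
library(motus)

# Lengthen the server timeout before retrying tellme()
srvTimeout(600)  # seconds, I believe; the default is much shorter
tellme(projRecv = 123, new = FALSE)
```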