Hi. I'm running a newspaper website on MT5, and we currently have about 22,000 entries on our main blog, dating back to 2007. We're about to import the rest of the archives we have available, but that's around 80,000 more stories. I realize that's a ton, but it would be really helpful to have them in MT so that we can use MT Search and so that they can be published with the new site interface.
Is importing this many entries going to ridiculously slow down MT? The first 22,000 slowed us down a bit, but not too badly. I'm thinking I will probably import them into a separate "archives" blog, so that they won't ever need to be republished with the rest of the articles on the site. Will that be enough to keep performance reasonable, or is MT going to choke on having 100,000+ entries (even if they are spread across different blogs)?
Thank you!
Reported on Movable Type 5

No, but you will need to be very mindful of your publishing settings and caching approach. I have a very complicated theme for my blog. It uses a lot of widgets and includes, so here's how I've kept publishing times under control:
1) Identify the most minimal set of templates that need to be immediately published. For a typical blog this is the main index, entry and page templates only. Shove everything else (feeds, other indexes, entry listings, etc.) into the Publish Queue.
2) Every module or widget that doesn't change between pages should be cached.
3) Open up your templates and identify modules that need to be cached, but can't be cached for the entire blog. An "entry details" module and a comment form are good examples. Use this method:
That would cache the entry details module, for that particular page, for six months at a time. The ttl value is the cache lifetime in seconds.
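For reference, an include along the lines described above might look something like the following. This is only a sketch: the module name "Entry Details" and the variable name cache_key are assumptions for illustration, and it assumes module caching is enabled in the blog's settings.

```
<mt:Ignore>Hypothetical module and variable names; adjust them to your own templates.</mt:Ignore>
<mt:SetVarBlock name="cache_key">entry-details-<$mt:EntryID$></mt:SetVarBlock>
<$mt:Include module="Entry Details" key="$cache_key" ttl="15552000"$>
```

Here 15552000 is 180 days x 86,400 seconds, or roughly six months. Building the key from the entry ID gives each entry page its own cached copy instead of one copy shared across the whole blog.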
On a related note, if you are going to have a very, very large data set, you might want to invest in Solr to power your search system. It's a free, Java-based search tool that you can put on your server inside Apache Tomcat (I'm assuming you have dedicated server(s) here). It's very fast. It doesn't provide a search interface, but it provides an HTTP interface for querying and returns very simple XML documents that a JavaScript developer should have no problem wrapping in a custom-written search form with jQuery.
Mike,
Thanks for the response. It sounds like MT itself doesn't have too much of a problem with a huge number of entries, which was what I was worried about more than the republishing. I didn't know that trick for caching specific module includes though - VERY useful.
As for the search, we're actually running a Windows IIS server over here (not by choice, but it's not too bad), so I'm not sure if Solr would work for us - I'm mostly hoping mt-search.cgi can handle parsing through so many entries.
Solr is a Java app. It requires Apache Tomcat or Jetty. If you control the server, you can install that for free and it doesn't require that much memory (and will scale a lot better than any search script written in PHP or Perl).
Sorry, I'm not sure if the fastsearch plugin will work on MT 5. It may, but I don't think it formally supports it.
Rich
For searching, I think you are better off using Google as your search provider. You can set up a custom Google search page for free if you allow ads to be shown on the search results page. That way you avoid all search load on your servers and let Google handle it.
If that is not an option, then I would use the fastsearch plugin (found at http://mt-hacks.com/fastsearch.html) instead of the native mt-search.cgi.
I found that to be much better, as mt-search.cgi kept bringing our site down if it got any significant use.
Thanks for the reply, Mike. I have the same issue with our domain service.