I am looking for some general input on our setup. We migrated from Goldmine starting in 3Q of last year after having used for the last 14 years. We used Goldmine as both a CRM and email client. We hired a consultant to migrate the Goldmine DB into Suite using an ETL tool called Starfish. It was painful but we got through it. We migrated our contacts and their history and then also generated accounts based on the contacts. For those that don’t know, there are not Accounts in Goldmine. All told it was about 38k contacts, 12k accounts and 800k pieces of history (mostly emails). The history was moved into the Notes table in Suite. All told, the database is around 9 GB.
Our Suite setup is 7.1.4 running on IIS8.5/Windows Server 2012/PHP 5.3.29/SQL Server 2012. The Suite server seems to be powerful enough at 8 GB of RAM, 2.27 GHz processor. It’s light on available disk space though. The SQL server is 6 GB, RAID5 array with plenty of space, and a 2.4 GHz processor. The interface is fast enough - navigating through the the site is fine, creating records, editing records, etc. We’ve incorporated a number of performance optimizations that we’ve found.
My problem comes when I have to run any query that hits that Notes table (~800k records). Let’s say I pull up a contact with extensive history, I’m looking at a 15 s response time to load that contact. I can minimize the History subpanel, leave come back and the page loads in about 2 s. Open the History subpanel, trigger a call into the DB and it might take 10 s to return the result. We’ve gone so far as to take the query generated by Suite and run it directly through SQL administrator. This process still takes time. All indications are pointing to any time I query that Notes table, it will take time. If you monitor the CPU when loading up a contact and their History, it approaches 90% utilization.
I’m thinking about a couple options and am soliciting opinions from the boards.
- Do I simply throw more horsepower at this - another processor?
- Do I create a custom module, called Archive, for example, and then import the Notes table into that module and let people go looking for older items in the Archive?
- Do I move to a SS drive for faster reads/writes but lose the RAID capability? We still do nightly backups but we’re likely to lose the transaction logs.
- Do I move to a different setup where a different DB engine might give me a boost? This is unlikely unless you think I would see a HUGE improvement.
Thanks for your time.
What’s the speed between your web server and database server?
Is it all phisical? nothing virtual?
Why would you lose RAID is you use SSD?
The Archive module is not a bad idea
Our internal network is gigabit. Everything is hosted internally.
We have four physical boxes that we then build our virtual servers off of those. SQL server is on a physically different box than the Suite server.
We would lose RAID going to SSD until we determined if that helped a lot. At that point, we would have to go back out and get two more drives.
Thanks for the reply.
I don’t have experience with such amount of data, maybe someone will give you hints regarding big Sugar/Suite Databases. I hope that you succeed optimizing your setup.
I have mostly the same setup, except I run Linux webserver via FreeTDS to MSSQL.
I just finished this, if you’re willing to test it. (~/include/database/MssqlManager.php)
In my case, using the Global Search (standard SugarCRM) would choke on the Contacts module. My changes to “Select TOP x *” reduced my search times from 29s to <1s, but obviously I’d like to see how it affects other use cases.
My diff is here (github) if you want to apply yourself.
Edit x2: Can’t attach files, but you can pull it from github.
Thank you for this valuable piece of code. I pulled the text into Beyond Compare and then pushed it into msqlmanager.php (after making a backup of course :)).
So far, I have seen a reduction in the response time - pulling up contacts has reduced the time by up 50% - quite an improvement! What is quite interesting is that my Global Search is now returning results that will allow my users to start using it once again. For my development machine, I had turned the FastCGI time up to 2 minutes in order to return a result when the Contacts module was being searched. Searching for only an Opportunity or only a Task (in that individual module) would return a result in only a second or two but searching the Contacts module would take around 1:40. Now I am returning Contacts or Accounts in only 2-4 s. That in itself is magnificent!!! Nice job, klou.
Great! Put it through it’s paces. If there aren’t any major issues, then I’ll submit a PR tomorrow.
What other performance tweaks are you doing?
Typos on lines 466,475, 492.
Add a space after '@topcount=$count ’ so that it lies within the double-quotes.
I updated the code to include the extra spaces now too although the lacks of spaces didn’t appear to be detrimental. We threw three more processors at the SQL server and increased the RAM a bit more. With all of our users on it today, the performance gains aren’t quite what I had seen last night during testing.
We did move to a SSD today but that did not appear to increase performance.
I keep coming back to the Notes table as the bottleneck. I wonder if anybody else out there has as much History (combination of Notes and Emails) as we do and has better performance.
I found it as it was being tagged as a slow query, but now I think that the spaces are due to some trimming and such as the query is being built.
What’s the query? Maybe there’s something you can do there.
I wanted to share some more optimization that I came across recently. My testing was on our development machine and the code you originally game me converting the queries to a DECLARE statement helped a lot there for the Global Search. Upon putting this into production, I didn’t see quite the same performance improvement. My DBA and I dug a bit and found that by adding an ‘ORDER BY id’ in several places greatly improved the Global Search on our production machine. Lines 460, 555, 564, 576, and 584 all read:
WHERE row_number > $start ORDER BY id";
That’s interesting. Adding additional processing (sort operation) somehow improves the performance?
Did you make other changes? The line numbers don’t seem to match up for me.
No other changes were made. One thing that has come up though is that the sorting of Contacts when viewing an Account is a little strange. When you click the next arrow to cycle through the Contacts, not every Contact is new. I backed down the number of times that I added the “ORDER BY id” to just one and my Global Search is still fast. Here is the query:
$newSQL = "DECLARE @topcount INT SET @topcount=$count SELECT TOP (@topcount) * FROM
" . $matches . $selectPart . ", ROW_NUMBER()
OVER (ORDER BY " . $this->returnOrderBy($sql, $orderByMatch) . ") AS row_number
" . $selectPart . $orderByMatch. "
) AS a
WHERE row_number > $start ORDER BY id";
I’ve undone this change for the moment. It messed up too many other things like sorting, report generation and not showing email addresses on lists of Contacts. Too bad because everyone was really excited about the new speed of Global Search.