How to Improve String Object Performance

2008. 5. 14. 11:00 | Posted by WiseBell

StringBuilder

Any "tips and tricks" presentation on the .NET Base Class Libraries will tell you that StringBuilder is better than String::Concat. Here's how that looks:

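void StringBuild::build(int loops)
{
    String* block = "123456789012345678901234567890";
    StringBuilder* result = new StringBuilder();

    // Append into the builder's internal buffer instead of creating a new String each time.
    for (int i = 0; i < loops; i++)
        result->Append(block);

    // Materialize the final String once, after all the appends are done.
    String* realresult = result->ToString();
    Console::WriteLine(realresult->Length.ToString());
}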

The only hassle is remembering to call ToString() on the StringBuilder when you've finished building it. The code is simple to write and to read, and it's much, much faster than the String::Concat() case. But it's not the fastest, at least not always.

Hand-allocating a buffer

StringBuilder works by allocating more memory than you need, and by tacking the new strings onto the end of that buffer. Every time the buffer needs to grow, it doubles in size. That approach was chosen as a trade-off between allocating too much memory and wasting a lot of time allocating little extra bits over and over again. The starting size is implementation-specific, but in most cases it's 16 characters. You can pass an integer to the StringBuilder constructor to bump up the initial allocation if you know what you'll need. That will save you the extra allocations, but it won't save you the test on every append to see whether you've exceeded your capacity.
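To get a feel for how doubling behaves, here is a rough model in plain standard C++. It is only a sketch of the strategy described above, not the real StringBuilder: the class name GrowBuffer, its members, and the default starting capacity of 16 are all made up for illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical model of a doubling buffer. NOT the actual StringBuilder code.
class GrowBuffer {
public:
    // Assumption: start at 16 characters unless the caller presizes the buffer.
    explicit GrowBuffer(std::size_t initialCapacity = 16)
        : data_(new char[initialCapacity]), length_(0),
          capacity_(initialCapacity), growths_(0) {}
    ~GrowBuffer() { delete[] data_; }
    GrowBuffer(const GrowBuffer&) = delete;
    GrowBuffer& operator=(const GrowBuffer&) = delete;

    void append(const char* text) {
        const std::size_t n = std::strlen(text);
        // This capacity test runs on every append, even when presized.
        if (length_ + n > capacity_) {
            std::size_t newCapacity = capacity_ ? capacity_ : 1;
            while (newCapacity < length_ + n)
                newCapacity *= 2;                 // double until the text fits
            char* bigger = new char[newCapacity];
            std::memcpy(bigger, data_, length_);  // carry over what we have so far
            delete[] data_;
            data_ = bigger;
            capacity_ = newCapacity;
            ++growths_;
        }
        std::memcpy(data_ + length_, text, n);
        length_ += n;
    }

    std::string str() const { return std::string(data_, length_); }
    std::size_t length() const { return length_; }
    std::size_t capacity() const { return capacity_; }
    std::size_t growths() const { return growths_; }  // number of reallocations

private:
    char* data_;
    std::size_t length_;
    std::size_t capacity_;
    std::size_t growths_;
};
```

In this model, appending the 30-character block 1,000 times from the default capacity reallocates 11 times and ends with a capacity of 32,768. Constructing it as GrowBuffer(30000) avoids every reallocation, but the capacity test inside append still runs on each call.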

So, because this is C++ after all, let's do something they can't do in VB and play with pointers a little. Look at this code:

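void StringBuffer::build(int loops)
{
    const char* block = "123456789012345678901234567890";
    int delta = (int)strlen(block);

    // Room for every copy of block plus one terminating \0.
    char* buffer = new char[loops*delta+1];
    buffer[0] = '\0';   // keeps the buffer a valid string even when loops is 0

    char* p = buffer;
    for (int i=0; i<loops;i++)
    {
        strcpy(p,block);    // copies block and its trailing \0
        p += delta;         // p now points at that \0, so the next copy overwrites it
    }

    String* result = new String(buffer);    // copies the characters into a managed String
    delete[] buffer;                        // the native buffer is no longer needed
    Console::WriteLine(result->Length.ToString());
}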

Whenever you work with char* strings, you have to remember when to add an extra character for the \0 and when to move past it or stop before it. In this code, delta, the distance we move forward each time, is deliberately set to exactly the strlen of block. Normally you would add 1 to allow room for the \0, but I want the first character of the next append to overwrite the \0 so we have one long, contiguous string at the end.
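The same pointer trick can be tried outside of managed code. The following is a minimal sketch in standard C++, assuming a helper called repeatBlock (a name invented here) that returns a std::string rather than a managed String.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: glue `loops` copies of `block` together with strcpy.
std::string repeatBlock(const char* block, int loops) {
    const std::size_t delta = std::strlen(block);  // no +1: the \0 gets overwritten
    const std::size_t count = loops > 0 ? static_cast<std::size_t>(loops) : 0;

    // One slot per character plus a single terminating \0 at the very end.
    std::vector<char> buffer(count * delta + 1, '\0');

    char* p = buffer.data();
    for (std::size_t i = 0; i < count; ++i) {
        std::strcpy(p, block);  // writes delta characters followed by a \0
        p += delta;             // p lands on that \0, so the next copy replaces it
    }
    return std::string(buffer.data());
}
```

Calling repeatBlock("ab", 3) yields "ababab". Only the \0 written by the last strcpy survives, and it sits in the one extra slot the buffer reserved for it.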

This code works, and it is faster than StringBuilder. That's not surprising, because I don't have to test to see whether I am exceeding my capacity, and I don't have to allocate more memory. Because I'm steering clear of the managed heap except for the final string, I'm probably not exercising the garbage collector either. So what is surprising is that it's not very much faster than StringBuilder: about 10-20% less time for the same number of loops. And this is for a special case where I knew the exact length of the buffer in advance. For the general case where you're gluing together an unknown number of strings, each of an unknown length, you're not going to beat StringBuilder with something you write yourself. That's worth knowing, isn't it?
