Why is this code 6.5x slower with optimizations enabled?Unit Testing C CodeWith arrays, why is it the case that a[5] == 5[a]?Why doesn't GCC optimize a*a*a*a*a*a to (a*a*a)*(a*a*a)?Why are elementwise additions much faster in separate loops than in a combined loop?What is “:-!!” in C code?Why is my program slow when looping over exactly 8192 elements?Obfuscated C Code Contest 2006. Please explain sykes2.cWhy does the C preprocessor interpret the word “linux” as the constant “1”?Why does GCC generate 15-20% faster code if I optimize for size instead of speed?How is the linking done for string functions in C?

What is the command to reset a PC without deleting any files

Where to refill my bottle in India?

What makes Graph invariants so useful/important?

Why don't electron-positron collisions release infinite energy?

A newer friend of my brother's gave him a load of baseball cards that are supposedly extremely valuable. Is this a scam?

How to use Pandas to get the count of every combination inclusive

Is Social Media Science Fiction?

Can you lasso down a wizard who is using the Levitate spell?

Need help identifying/translating a plaque in Tangier, Morocco

Can I interfere when another PC is about to be attacked?

Why has Russell's definition of numbers using equivalence classes been finally abandoned? ( If it has actually been abandoned).

My colleague's body is amazing

Download, install and reboot computer at night if needed

A function which translates a sentence to title-case

Can I make popcorn with any corn?

How do I create uniquely male characters?

Chess with symmetric move-square

I’m planning on buying a laser printer but concerned about the life cycle of toner in the machine

What is the white spray-pattern residue inside these Falcon Heavy nozzles?

What is GPS' 19 year rollover and does it present a cybersecurity issue?

How can bays and straits be determined in a procedurally generated map?

Example of a relative pronoun

Do airline pilots ever risk not hearing communication directed to them specifically, from traffic controllers?

Why is the design of haulage companies so “special”?



Why is this code 6.5x slower with optimizations enabled?


Unit Testing C CodeWith arrays, why is it the case that a[5] == 5[a]?Why doesn't GCC optimize a*a*a*a*a*a to (a*a*a)*(a*a*a)?Why are elementwise additions much faster in separate loops than in a combined loop?What is “:-!!” in C code?Why is my program slow when looping over exactly 8192 elements?Obfuscated C Code Contest 2006. Please explain sykes2.cWhy does the C preprocessor interpret the word “linux” as the constant “1”?Why does GCC generate 15-20% faster code if I optimize for size instead of speed?How is the linking done for string functions in C?






.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty height:90px;width:728px;box-sizing:border-box;








23















I wanted to benchmark glibc's strlen function for some reason and found out it apparently performs much slower with optimizations enabled in GCC and I have no idea why.



Here's my code:



#include <time.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

int main()
char *s = calloc(1 << 20, 1);
memset(s, 65, 1000000);
clock_t start = clock();
for (int i = 0; i < 128; ++i)
s[strlen(s)] = 'A';

clock_t end = clock();
printf("%lldn", (long long)(end-start));
return 0;



On my machine it outputs:



$ gcc test.c && ./a.out
13336
$ gcc -O1 test.c && ./a.out
199004
$ gcc -O2 test.c && ./a.out
83415
$ gcc -O3 test.c && ./a.out
83415


Somehow, enabling optimizations causes it to execute longer.










share|improve this question
























  • With gcc-8.2 debug version takes 51334, release 8246. Release compiler options -O3 -march=native -DNDEBUG

    – Maxim Egorushkin
    9 hours ago







  • 1





    Please report it to gcc's bugzilla.

    – Marc Glisse
    9 hours ago






  • 1





    Using -fno-builtin makes the problem go away. So presumably the issue is that in this particular instance, GCC's builtin strlen is slower than the library's.

    – David Schwartz
    9 hours ago











  • It is generating repnz scasb for strlen at -O1.

    – Marc Glisse
    9 hours ago







  • 1





    @MarcGlisse It's already been filed: gcc.gnu.org/bugzilla/show_bug.cgi?id=88809

    – Justin
    3 hours ago

















23















I wanted to benchmark glibc's strlen function for some reason and found out it apparently performs much slower with optimizations enabled in GCC and I have no idea why.



Here's my code:



#include <time.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

int main()
char *s = calloc(1 << 20, 1);
memset(s, 65, 1000000);
clock_t start = clock();
for (int i = 0; i < 128; ++i)
s[strlen(s)] = 'A';

clock_t end = clock();
printf("%lldn", (long long)(end-start));
return 0;



On my machine it outputs:



$ gcc test.c && ./a.out
13336
$ gcc -O1 test.c && ./a.out
199004
$ gcc -O2 test.c && ./a.out
83415
$ gcc -O3 test.c && ./a.out
83415


Somehow, enabling optimizations causes it to execute longer.










share|improve this question
























  • With gcc-8.2 debug version takes 51334, release 8246. Release compiler options -O3 -march=native -DNDEBUG

    – Maxim Egorushkin
    9 hours ago







  • 1





    Please report it to gcc's bugzilla.

    – Marc Glisse
    9 hours ago






  • 1





    Using -fno-builtin makes the problem go away. So presumably the issue is that in this particular instance, GCC's builtin strlen is slower than the library's.

    – David Schwartz
    9 hours ago











  • It is generating repnz scasb for strlen at -O1.

    – Marc Glisse
    9 hours ago







  • 1





    @MarcGlisse It's already been filed: gcc.gnu.org/bugzilla/show_bug.cgi?id=88809

    – Justin
    3 hours ago













23












23








23


1






I wanted to benchmark glibc's strlen function for some reason and found out it apparently performs much slower with optimizations enabled in GCC and I have no idea why.



Here's my code:



#include <time.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

int main()
char *s = calloc(1 << 20, 1);
memset(s, 65, 1000000);
clock_t start = clock();
for (int i = 0; i < 128; ++i)
s[strlen(s)] = 'A';

clock_t end = clock();
printf("%lldn", (long long)(end-start));
return 0;



On my machine it outputs:



$ gcc test.c && ./a.out
13336
$ gcc -O1 test.c && ./a.out
199004
$ gcc -O2 test.c && ./a.out
83415
$ gcc -O3 test.c && ./a.out
83415


Somehow, enabling optimizations causes it to execute longer.










share|improve this question
















I wanted to benchmark glibc's strlen function for some reason and found out it apparently performs much slower with optimizations enabled in GCC and I have no idea why.



Here's my code:



#include <time.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

int main()
char *s = calloc(1 << 20, 1);
memset(s, 65, 1000000);
clock_t start = clock();
for (int i = 0; i < 128; ++i)
s[strlen(s)] = 'A';

clock_t end = clock();
printf("%lldn", (long long)(end-start));
return 0;



On my machine it outputs:



$ gcc test.c && ./a.out
13336
$ gcc -O1 test.c && ./a.out
199004
$ gcc -O2 test.c && ./a.out
83415
$ gcc -O3 test.c && ./a.out
83415


Somehow, enabling optimizations causes it to execute longer.







c performance gcc glibc






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited 9 mins ago









Muntasir

6301919




6301919










asked 9 hours ago









TsarNTsarN

11817




11817












  • With gcc-8.2 debug version takes 51334, release 8246. Release compiler options -O3 -march=native -DNDEBUG

    – Maxim Egorushkin
    9 hours ago







  • 1





    Please report it to gcc's bugzilla.

    – Marc Glisse
    9 hours ago






  • 1





    Using -fno-builtin makes the problem go away. So presumably the issue is that in this particular instance, GCC's builtin strlen is slower than the library's.

    – David Schwartz
    9 hours ago











  • It is generating repnz scasb for strlen at -O1.

    – Marc Glisse
    9 hours ago







  • 1





    @MarcGlisse It's already been filed: gcc.gnu.org/bugzilla/show_bug.cgi?id=88809

    – Justin
    3 hours ago

















  • With gcc-8.2 debug version takes 51334, release 8246. Release compiler options -O3 -march=native -DNDEBUG

    – Maxim Egorushkin
    9 hours ago







  • 1





    Please report it to gcc's bugzilla.

    – Marc Glisse
    9 hours ago






  • 1





    Using -fno-builtin makes the problem go away. So presumably the issue is that in this particular instance, GCC's builtin strlen is slower than the library's.

    – David Schwartz
    9 hours ago











  • It is generating repnz scasb for strlen at -O1.

    – Marc Glisse
    9 hours ago







  • 1





    @MarcGlisse It's already been filed: gcc.gnu.org/bugzilla/show_bug.cgi?id=88809

    – Justin
    3 hours ago
















With gcc-8.2 debug version takes 51334, release 8246. Release compiler options -O3 -march=native -DNDEBUG

– Maxim Egorushkin
9 hours ago






With gcc-8.2 debug version takes 51334, release 8246. Release compiler options -O3 -march=native -DNDEBUG

– Maxim Egorushkin
9 hours ago





1




1





Please report it to gcc's bugzilla.

– Marc Glisse
9 hours ago





Please report it to gcc's bugzilla.

– Marc Glisse
9 hours ago




1




1





Using -fno-builtin makes the problem go away. So presumably the issue is that in this particular instance, GCC's builtin strlen is slower than the library's.

– David Schwartz
9 hours ago





Using -fno-builtin makes the problem go away. So presumably the issue is that in this particular instance, GCC's builtin strlen is slower than the library's.

– David Schwartz
9 hours ago













It is generating repnz scasb for strlen at -O1.

– Marc Glisse
9 hours ago






It is generating repnz scasb for strlen at -O1.

– Marc Glisse
9 hours ago





1




1





@MarcGlisse It's already been filed: gcc.gnu.org/bugzilla/show_bug.cgi?id=88809

– Justin
3 hours ago





@MarcGlisse It's already been filed: gcc.gnu.org/bugzilla/show_bug.cgi?id=88809

– Justin
3 hours ago












1 Answer
1






active

oldest

votes


















22














Testing your code on Godbolt's Compiler Explorer provides this explanation:



  • at -O0 or without optimisations, the generated code call the C library function strlen

  • at -O1 the generated code uses a simple inline expansion using a rep scasb instruction.

  • at -O2 and above, the generated code uses a more elaborate inline expansion.

Benchmarking your code repeatedly shows a substantial variation from one run to another, but increasing the number of iterations shows that:



  • the -O1 code is much slower than the C library implementation: 32240 vs 3090

  • the -O2 code is faster than the -O1 but still substantially slower than the C ibrary code: 8570 vs 3090.

This behavior is specific to gcc and the glibc. The same test on OS/X with clang and Apple's Libc does not show a significant difference, which is not a surprise as Godbolt shows that clang generates a call to the C library strlen at all optimisation levels.



This could be considered a bug in gcc/glibc but more extensive benchmarking might show that the overhead of calling strlen has a more important impact than the lack of performance of the inline code for small strings. The strings on which you benchmark are uncommonly large, so focusing the benchmark on ultra-long strings might not give meaningful results.



I updated the benchmark for smaller strings and it shows similar performance for string lengths varying from 0 to 100 at -O0 and -O2 but still a much worse performance at -O1, 3 times slower.



Here is the updated code:



#include <stdlib.h>
#include <string.h>
#include <time.h>

void benchmark(int repeat, int minlen, int maxlen)
char *s = malloc(maxlen + 1);
memset(s, 'A', minlen);
long long bytes = 0, calls = 0;
clock_t clk = clock();
for (int n = 0; n < repeat; n++)
for (int i = minlen; i < maxlen; ++i)
bytes += i + 1;
calls += 1;
s[i] = '';
s[strlen(s)] = 'A';


clk = clock() - clk;
free(s);
double avglen = (minlen + maxlen - 1) / 2.0;
double ns = (double)clk * 1e9 / CLOCKS_PER_SEC;
printf("average length %7.0f -> avg time: %7.3f ns/byte, %7.3f ns/calln",
avglen, ns / bytes, ns / calls);


int main()
benchmark(10000000, 0, 1);
benchmark(1000000, 0, 10);
benchmark(1000000, 5, 15);
benchmark(100000, 0, 100);
benchmark(100000, 50, 150);
benchmark(10000, 0, 1000);
benchmark(10000, 500, 1500);
benchmark(1000, 0, 10000);
benchmark(1000, 5000, 15000);
benchmark(100, 1000000 - 50, 1000000 + 50);
return 0;



Here is the output:




chqrlie> gcc -std=c99 -O0 benchstrlen.c && ./a.out
average length 0 -> avg time: 14.000 ns/byte, 14.000 ns/call
average length 4 -> avg time: 2.364 ns/byte, 13.000 ns/call
average length 10 -> avg time: 1.238 ns/byte, 13.000 ns/call
average length 50 -> avg time: 0.317 ns/byte, 16.000 ns/call
average length 100 -> avg time: 0.169 ns/byte, 17.000 ns/call
average length 500 -> avg time: 0.074 ns/byte, 37.000 ns/call
average length 1000 -> avg time: 0.068 ns/byte, 68.000 ns/call
average length 5000 -> avg time: 0.064 ns/byte, 318.000 ns/call
average length 10000 -> avg time: 0.062 ns/byte, 622.000 ns/call
average length 1000000 -> avg time: 0.062 ns/byte, 62000.000 ns/call
chqrlie> gcc -std=c99 -O1 benchstrlen.c && ./a.out
average length 0 -> avg time: 20.000 ns/byte, 20.000 ns/call
average length 4 -> avg time: 3.818 ns/byte, 21.000 ns/call
average length 10 -> avg time: 2.190 ns/byte, 23.000 ns/call
average length 50 -> avg time: 0.990 ns/byte, 50.000 ns/call
average length 100 -> avg time: 0.816 ns/byte, 82.000 ns/call
average length 500 -> avg time: 0.679 ns/byte, 340.000 ns/call
average length 1000 -> avg time: 0.664 ns/byte, 664.000 ns/call
average length 5000 -> avg time: 0.651 ns/byte, 3254.000 ns/call
average length 10000 -> avg time: 0.649 ns/byte, 6491.000 ns/call
average length 1000000 -> avg time: 0.648 ns/byte, 648000.000 ns/call
chqrlie> gcc -std=c99 -O2 benchstrlen.c && ./a.out
average length 0 -> avg time: 10.000 ns/byte, 10.000 ns/call
average length 4 -> avg time: 2.000 ns/byte, 11.000 ns/call
average length 10 -> avg time: 1.048 ns/byte, 11.000 ns/call
average length 50 -> avg time: 0.337 ns/byte, 17.000 ns/call
average length 100 -> avg time: 0.299 ns/byte, 30.000 ns/call
average length 500 -> avg time: 0.202 ns/byte, 101.000 ns/call
average length 1000 -> avg time: 0.188 ns/byte, 188.000 ns/call
average length 5000 -> avg time: 0.174 ns/byte, 868.000 ns/call
average length 10000 -> avg time: 0.172 ns/byte, 1716.000 ns/call
average length 1000000 -> avg time: 0.172 ns/byte, 172000.000 ns/call





share|improve this answer

























  • Wouldn't it still be better for the inlined version to use the same optimizations as the library strlen, giving the best of both worlds?

    – Daniel H
    8 hours ago






  • 1





    It would, but the hand optimized version in the C library might be larger and more complicated to inline. I have not looked into this recently, but there used to be a mix of complex platform specific macros in <string.h> and hard coded optimisations in the gcc code generator. Definitely still room for improvement on intel targets.

    – chqrlie
    8 hours ago











  • Does it change if you use -march=native -mtune=native?

    – Deduplicator
    7 hours ago











  • Note that the GNU C library function for strlen() is likely optimised for extremely large strings (that no sane programmer will care about) at the expense of performance for small strings (that are extremely common); and the optimisations done by the library version should never be done. The problem here is that the OP's code doesn't keep track of the string's length itself (e.g. with an int len; variable) and should not have used strlen() at all, making the code so bad for performance that "optimised for something no sane programmer would care about" actually helped.

    – Brendan
    7 hours ago







  • 1





    @chqrlie: I'd also say that this is partly a symptom of a larger problem - code in libraries can't be optimised for any specific case and therefore must be "un-optimal" for some cases. To work around that it would've been nice if there was a strlen_small() and a separate strlen_large(), but there isn't.

    – Brendan
    6 hours ago











Your Answer






StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);













draft saved

draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55563598%2fwhy-is-this-code-6-5x-slower-with-optimizations-enabled%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









22














Testing your code on Godbolt's Compiler Explorer provides this explanation:



  • at -O0 or without optimisations, the generated code call the C library function strlen

  • at -O1 the generated code uses a simple inline expansion using a rep scasb instruction.

  • at -O2 and above, the generated code uses a more elaborate inline expansion.

Benchmarking your code repeatedly shows a substantial variation from one run to another, but increasing the number of iterations shows that:



  • the -O1 code is much slower than the C library implementation: 32240 vs 3090

  • the -O2 code is faster than the -O1 but still substantially slower than the C ibrary code: 8570 vs 3090.

This behavior is specific to gcc and the glibc. The same test on OS/X with clang and Apple's Libc does not show a significant difference, which is not a surprise as Godbolt shows that clang generates a call to the C library strlen at all optimisation levels.



This could be considered a bug in gcc/glibc but more extensive benchmarking might show that the overhead of calling strlen has a more important impact than the lack of performance of the inline code for small strings. The strings on which you benchmark are uncommonly large, so focusing the benchmark on ultra-long strings might not give meaningful results.



I updated the benchmark for smaller strings and it shows similar performance for string lengths varying from 0 to 100 at -O0 and -O2 but still a much worse performance at -O1, 3 times slower.



Here is the updated code:



#include <stdlib.h>
#include <string.h>
#include <time.h>

void benchmark(int repeat, int minlen, int maxlen)
char *s = malloc(maxlen + 1);
memset(s, 'A', minlen);
long long bytes = 0, calls = 0;
clock_t clk = clock();
for (int n = 0; n < repeat; n++)
for (int i = minlen; i < maxlen; ++i)
bytes += i + 1;
calls += 1;
s[i] = '';
s[strlen(s)] = 'A';


clk = clock() - clk;
free(s);
double avglen = (minlen + maxlen - 1) / 2.0;
double ns = (double)clk * 1e9 / CLOCKS_PER_SEC;
printf("average length %7.0f -> avg time: %7.3f ns/byte, %7.3f ns/calln",
avglen, ns / bytes, ns / calls);


int main()
benchmark(10000000, 0, 1);
benchmark(1000000, 0, 10);
benchmark(1000000, 5, 15);
benchmark(100000, 0, 100);
benchmark(100000, 50, 150);
benchmark(10000, 0, 1000);
benchmark(10000, 500, 1500);
benchmark(1000, 0, 10000);
benchmark(1000, 5000, 15000);
benchmark(100, 1000000 - 50, 1000000 + 50);
return 0;



Here is the output:




chqrlie> gcc -std=c99 -O0 benchstrlen.c && ./a.out
average length 0 -> avg time: 14.000 ns/byte, 14.000 ns/call
average length 4 -> avg time: 2.364 ns/byte, 13.000 ns/call
average length 10 -> avg time: 1.238 ns/byte, 13.000 ns/call
average length 50 -> avg time: 0.317 ns/byte, 16.000 ns/call
average length 100 -> avg time: 0.169 ns/byte, 17.000 ns/call
average length 500 -> avg time: 0.074 ns/byte, 37.000 ns/call
average length 1000 -> avg time: 0.068 ns/byte, 68.000 ns/call
average length 5000 -> avg time: 0.064 ns/byte, 318.000 ns/call
average length 10000 -> avg time: 0.062 ns/byte, 622.000 ns/call
average length 1000000 -> avg time: 0.062 ns/byte, 62000.000 ns/call
chqrlie> gcc -std=c99 -O1 benchstrlen.c && ./a.out
average length 0 -> avg time: 20.000 ns/byte, 20.000 ns/call
average length 4 -> avg time: 3.818 ns/byte, 21.000 ns/call
average length 10 -> avg time: 2.190 ns/byte, 23.000 ns/call
average length 50 -> avg time: 0.990 ns/byte, 50.000 ns/call
average length 100 -> avg time: 0.816 ns/byte, 82.000 ns/call
average length 500 -> avg time: 0.679 ns/byte, 340.000 ns/call
average length 1000 -> avg time: 0.664 ns/byte, 664.000 ns/call
average length 5000 -> avg time: 0.651 ns/byte, 3254.000 ns/call
average length 10000 -> avg time: 0.649 ns/byte, 6491.000 ns/call
average length 1000000 -> avg time: 0.648 ns/byte, 648000.000 ns/call
chqrlie> gcc -std=c99 -O2 benchstrlen.c && ./a.out
average length 0 -> avg time: 10.000 ns/byte, 10.000 ns/call
average length 4 -> avg time: 2.000 ns/byte, 11.000 ns/call
average length 10 -> avg time: 1.048 ns/byte, 11.000 ns/call
average length 50 -> avg time: 0.337 ns/byte, 17.000 ns/call
average length 100 -> avg time: 0.299 ns/byte, 30.000 ns/call
average length 500 -> avg time: 0.202 ns/byte, 101.000 ns/call
average length 1000 -> avg time: 0.188 ns/byte, 188.000 ns/call
average length 5000 -> avg time: 0.174 ns/byte, 868.000 ns/call
average length 10000 -> avg time: 0.172 ns/byte, 1716.000 ns/call
average length 1000000 -> avg time: 0.172 ns/byte, 172000.000 ns/call





share|improve this answer

























  • Wouldn't it still be better for the inlined version to use the same optimizations as the library strlen, giving the best of both worlds?

    – Daniel H
    8 hours ago






  • 1





    It would, but the hand optimized version in the C library might be larger and more complicated to inline. I have not looked into this recently, but there used to be a mix of complex platform specific macros in <string.h> and hard coded optimisations in the gcc code generator. Definitely still room for improvement on intel targets.

    – chqrlie
    8 hours ago











  • Does it change if you use -march=native -mtune=native?

    – Deduplicator
    7 hours ago











  • Note that the GNU C library function for strlen() is likely optimised for extremely large strings (that no sane programmer will care about) at the expense of performance for small strings (that are extremely common); and the optimisations done by the library version should never be done. The problem here is that the OP's code doesn't keep track of the string's length itself (e.g. with an int len; variable) and should not have used strlen() at all, making the code so bad for performance that "optimised for something no sane programmer would care about" actually helped.

    – Brendan
    7 hours ago







  • 1





    @chqrlie: I'd also say that this is partly a symptom of a larger problem - code in libraries can't be optimised for any specific case and therefore must be "un-optimal" for some cases. To work around that it would've been nice if there was a strlen_small() and a separate strlen_large(), but there isn't.

    – Brendan
    6 hours ago















22














Testing your code on Godbolt's Compiler Explorer provides this explanation:



  • at -O0 or without optimisations, the generated code call the C library function strlen

  • at -O1 the generated code uses a simple inline expansion using a rep scasb instruction.

  • at -O2 and above, the generated code uses a more elaborate inline expansion.

Benchmarking your code repeatedly shows a substantial variation from one run to another, but increasing the number of iterations shows that:



  • the -O1 code is much slower than the C library implementation: 32240 vs 3090

  • the -O2 code is faster than the -O1 but still substantially slower than the C ibrary code: 8570 vs 3090.

This behavior is specific to gcc and the glibc. The same test on OS/X with clang and Apple's Libc does not show a significant difference, which is not a surprise as Godbolt shows that clang generates a call to the C library strlen at all optimisation levels.



This could be considered a bug in gcc/glibc but more extensive benchmarking might show that the overhead of calling strlen has a more important impact than the lack of performance of the inline code for small strings. The strings on which you benchmark are uncommonly large, so focusing the benchmark on ultra-long strings might not give meaningful results.



I updated the benchmark for smaller strings and it shows similar performance for string lengths varying from 0 to 100 at -O0 and -O2 but still a much worse performance at -O1, 3 times slower.



Here is the updated code:



#include <stdlib.h>
#include <string.h>
#include <time.h>

void benchmark(int repeat, int minlen, int maxlen)
char *s = malloc(maxlen + 1);
memset(s, 'A', minlen);
long long bytes = 0, calls = 0;
clock_t clk = clock();
for (int n = 0; n < repeat; n++)
for (int i = minlen; i < maxlen; ++i)
bytes += i + 1;
calls += 1;
s[i] = '';
s[strlen(s)] = 'A';


clk = clock() - clk;
free(s);
double avglen = (minlen + maxlen - 1) / 2.0;
double ns = (double)clk * 1e9 / CLOCKS_PER_SEC;
printf("average length %7.0f -> avg time: %7.3f ns/byte, %7.3f ns/calln",
avglen, ns / bytes, ns / calls);


int main()
benchmark(10000000, 0, 1);
benchmark(1000000, 0, 10);
benchmark(1000000, 5, 15);
benchmark(100000, 0, 100);
benchmark(100000, 50, 150);
benchmark(10000, 0, 1000);
benchmark(10000, 500, 1500);
benchmark(1000, 0, 10000);
benchmark(1000, 5000, 15000);
benchmark(100, 1000000 - 50, 1000000 + 50);
return 0;



Here is the output:




chqrlie> gcc -std=c99 -O0 benchstrlen.c && ./a.out
average length 0 -> avg time: 14.000 ns/byte, 14.000 ns/call
average length 4 -> avg time: 2.364 ns/byte, 13.000 ns/call
average length 10 -> avg time: 1.238 ns/byte, 13.000 ns/call
average length 50 -> avg time: 0.317 ns/byte, 16.000 ns/call
average length 100 -> avg time: 0.169 ns/byte, 17.000 ns/call
average length 500 -> avg time: 0.074 ns/byte, 37.000 ns/call
average length 1000 -> avg time: 0.068 ns/byte, 68.000 ns/call
average length 5000 -> avg time: 0.064 ns/byte, 318.000 ns/call
average length 10000 -> avg time: 0.062 ns/byte, 622.000 ns/call
average length 1000000 -> avg time: 0.062 ns/byte, 62000.000 ns/call
chqrlie> gcc -std=c99 -O1 benchstrlen.c && ./a.out
average length 0 -> avg time: 20.000 ns/byte, 20.000 ns/call
average length 4 -> avg time: 3.818 ns/byte, 21.000 ns/call
average length 10 -> avg time: 2.190 ns/byte, 23.000 ns/call
average length 50 -> avg time: 0.990 ns/byte, 50.000 ns/call
average length 100 -> avg time: 0.816 ns/byte, 82.000 ns/call
average length 500 -> avg time: 0.679 ns/byte, 340.000 ns/call
average length 1000 -> avg time: 0.664 ns/byte, 664.000 ns/call
average length 5000 -> avg time: 0.651 ns/byte, 3254.000 ns/call
average length 10000 -> avg time: 0.649 ns/byte, 6491.000 ns/call
average length 1000000 -> avg time: 0.648 ns/byte, 648000.000 ns/call
chqrlie> gcc -std=c99 -O2 benchstrlen.c && ./a.out
average length 0 -> avg time: 10.000 ns/byte, 10.000 ns/call
average length 4 -> avg time: 2.000 ns/byte, 11.000 ns/call
average length 10 -> avg time: 1.048 ns/byte, 11.000 ns/call
average length 50 -> avg time: 0.337 ns/byte, 17.000 ns/call
average length 100 -> avg time: 0.299 ns/byte, 30.000 ns/call
average length 500 -> avg time: 0.202 ns/byte, 101.000 ns/call
average length 1000 -> avg time: 0.188 ns/byte, 188.000 ns/call
average length 5000 -> avg time: 0.174 ns/byte, 868.000 ns/call
average length 10000 -> avg time: 0.172 ns/byte, 1716.000 ns/call
average length 1000000 -> avg time: 0.172 ns/byte, 172000.000 ns/call





share|improve this answer

























  • Wouldn't it still be better for the inlined version to use the same optimizations as the library strlen, giving the best of both worlds?

    – Daniel H
    8 hours ago






  • 1





    It would, but the hand optimized version in the C library might be larger and more complicated to inline. I have not looked into this recently, but there used to be a mix of complex platform specific macros in <string.h> and hard coded optimisations in the gcc code generator. Definitely still room for improvement on intel targets.

    – chqrlie
    8 hours ago











  • Does it change if you use -march=native -mtune=native?

    – Deduplicator
    7 hours ago











  • Note that the GNU C library function for strlen() is likely optimised for extremely large strings (that no sane programmer will care about) at the expense of performance for small strings (that are extremely common); and the optimisations done by the library version should never be done. The problem here is that the OP's code doesn't keep track of the string's length itself (e.g. with an int len; variable) and should not have used strlen() at all, making the code so bad for performance that "optimised for something no sane programmer would care about" actually helped.

    – Brendan
    7 hours ago







  • 1





    @chqrlie: I'd also say that this is partly a symptom of a larger problem - code in libraries can't be optimised for any specific case and therefore must be "un-optimal" for some cases. To work around that it would've been nice if there was a strlen_small() and a separate strlen_large(), but there isn't.

    – Brendan
    6 hours ago













22












22








22







Testing your code on Godbolt's Compiler Explorer provides this explanation:



  • at -O0 or without optimisations, the generated code call the C library function strlen

  • at -O1 the generated code uses a simple inline expansion using a rep scasb instruction.

  • at -O2 and above, the generated code uses a more elaborate inline expansion.

Benchmarking your code repeatedly shows a substantial variation from one run to another, but increasing the number of iterations shows that:



  • the -O1 code is much slower than the C library implementation: 32240 vs 3090

  • the -O2 code is faster than the -O1 but still substantially slower than the C ibrary code: 8570 vs 3090.

This behavior is specific to gcc and the glibc. The same test on OS/X with clang and Apple's Libc does not show a significant difference, which is not a surprise as Godbolt shows that clang generates a call to the C library strlen at all optimisation levels.



This could be considered a bug in gcc/glibc but more extensive benchmarking might show that the overhead of calling strlen has a more important impact than the lack of performance of the inline code for small strings. The strings on which you benchmark are uncommonly large, so focusing the benchmark on ultra-long strings might not give meaningful results.



I updated the benchmark for smaller strings and it shows similar performance for string lengths varying from 0 to 100 at -O0 and -O2 but still a much worse performance at -O1, 3 times slower.



Here is the updated code:



#include <stdlib.h>
#include <string.h>
#include <time.h>

void benchmark(int repeat, int minlen, int maxlen)
char *s = malloc(maxlen + 1);
memset(s, 'A', minlen);
long long bytes = 0, calls = 0;
clock_t clk = clock();
for (int n = 0; n < repeat; n++)
for (int i = minlen; i < maxlen; ++i)
bytes += i + 1;
calls += 1;
s[i] = '';
s[strlen(s)] = 'A';


clk = clock() - clk;
free(s);
double avglen = (minlen + maxlen - 1) / 2.0;
double ns = (double)clk * 1e9 / CLOCKS_PER_SEC;
printf("average length %7.0f -> avg time: %7.3f ns/byte, %7.3f ns/calln",
avglen, ns / bytes, ns / calls);


int main()
benchmark(10000000, 0, 1);
benchmark(1000000, 0, 10);
benchmark(1000000, 5, 15);
benchmark(100000, 0, 100);
benchmark(100000, 50, 150);
benchmark(10000, 0, 1000);
benchmark(10000, 500, 1500);
benchmark(1000, 0, 10000);
benchmark(1000, 5000, 15000);
benchmark(100, 1000000 - 50, 1000000 + 50);
return 0;



Here is the output:




chqrlie> gcc -std=c99 -O0 benchstrlen.c && ./a.out
average length 0 -> avg time: 14.000 ns/byte, 14.000 ns/call
average length 4 -> avg time: 2.364 ns/byte, 13.000 ns/call
average length 10 -> avg time: 1.238 ns/byte, 13.000 ns/call
average length 50 -> avg time: 0.317 ns/byte, 16.000 ns/call
average length 100 -> avg time: 0.169 ns/byte, 17.000 ns/call
average length 500 -> avg time: 0.074 ns/byte, 37.000 ns/call
average length 1000 -> avg time: 0.068 ns/byte, 68.000 ns/call
average length 5000 -> avg time: 0.064 ns/byte, 318.000 ns/call
average length 10000 -> avg time: 0.062 ns/byte, 622.000 ns/call
average length 1000000 -> avg time: 0.062 ns/byte, 62000.000 ns/call
chqrlie> gcc -std=c99 -O1 benchstrlen.c && ./a.out
average length 0 -> avg time: 20.000 ns/byte, 20.000 ns/call
average length 4 -> avg time: 3.818 ns/byte, 21.000 ns/call
average length 10 -> avg time: 2.190 ns/byte, 23.000 ns/call
average length 50 -> avg time: 0.990 ns/byte, 50.000 ns/call
average length 100 -> avg time: 0.816 ns/byte, 82.000 ns/call
average length 500 -> avg time: 0.679 ns/byte, 340.000 ns/call
average length 1000 -> avg time: 0.664 ns/byte, 664.000 ns/call
average length 5000 -> avg time: 0.651 ns/byte, 3254.000 ns/call
average length 10000 -> avg time: 0.649 ns/byte, 6491.000 ns/call
average length 1000000 -> avg time: 0.648 ns/byte, 648000.000 ns/call
chqrlie> gcc -std=c99 -O2 benchstrlen.c && ./a.out
average length 0 -> avg time: 10.000 ns/byte, 10.000 ns/call
average length 4 -> avg time: 2.000 ns/byte, 11.000 ns/call
average length 10 -> avg time: 1.048 ns/byte, 11.000 ns/call
average length 50 -> avg time: 0.337 ns/byte, 17.000 ns/call
average length 100 -> avg time: 0.299 ns/byte, 30.000 ns/call
average length 500 -> avg time: 0.202 ns/byte, 101.000 ns/call
average length 1000 -> avg time: 0.188 ns/byte, 188.000 ns/call
average length 5000 -> avg time: 0.174 ns/byte, 868.000 ns/call
average length 10000 -> avg time: 0.172 ns/byte, 1716.000 ns/call
average length 1000000 -> avg time: 0.172 ns/byte, 172000.000 ns/call





share|improve this answer















Testing your code on Godbolt's Compiler Explorer provides this explanation:



  • at -O0 or without optimisations, the generated code call the C library function strlen

  • at -O1 the generated code uses a simple inline expansion using a rep scasb instruction.

  • at -O2 and above, the generated code uses a more elaborate inline expansion.

Benchmarking your code repeatedly shows a substantial variation from one run to another, but increasing the number of iterations shows that:



  • the -O1 code is much slower than the C library implementation: 32240 vs 3090

  • the -O2 code is faster than the -O1 but still substantially slower than the C ibrary code: 8570 vs 3090.

This behavior is specific to gcc and the glibc. The same test on OS/X with clang and Apple's Libc does not show a significant difference, which is not a surprise as Godbolt shows that clang generates a call to the C library strlen at all optimisation levels.



This could be considered a bug in gcc/glibc but more extensive benchmarking might show that the overhead of calling strlen has a more important impact than the lack of performance of the inline code for small strings. The strings on which you benchmark are uncommonly large, so focusing the benchmark on ultra-long strings might not give meaningful results.



I updated the benchmark for smaller strings and it shows similar performance for string lengths varying from 0 to 100 at -O0 and -O2 but still a much worse performance at -O1, 3 times slower.



Here is the updated code:



#include <stdlib.h>
#include <string.h>
#include <time.h>

void benchmark(int repeat, int minlen, int maxlen)
char *s = malloc(maxlen + 1);
memset(s, 'A', minlen);
long long bytes = 0, calls = 0;
clock_t clk = clock();
for (int n = 0; n < repeat; n++)
for (int i = minlen; i < maxlen; ++i)
bytes += i + 1;
calls += 1;
s[i] = '';
s[strlen(s)] = 'A';


clk = clock() - clk;
free(s);
double avglen = (minlen + maxlen - 1) / 2.0;
double ns = (double)clk * 1e9 / CLOCKS_PER_SEC;
printf("average length %7.0f -> avg time: %7.3f ns/byte, %7.3f ns/calln",
avglen, ns / bytes, ns / calls);


int main()
benchmark(10000000, 0, 1);
benchmark(1000000, 0, 10);
benchmark(1000000, 5, 15);
benchmark(100000, 0, 100);
benchmark(100000, 50, 150);
benchmark(10000, 0, 1000);
benchmark(10000, 500, 1500);
benchmark(1000, 0, 10000);
benchmark(1000, 5000, 15000);
benchmark(100, 1000000 - 50, 1000000 + 50);
return 0;



Here is the output:




chqrlie> gcc -std=c99 -O0 benchstrlen.c && ./a.out
average length 0 -> avg time: 14.000 ns/byte, 14.000 ns/call
average length 4 -> avg time: 2.364 ns/byte, 13.000 ns/call
average length 10 -> avg time: 1.238 ns/byte, 13.000 ns/call
average length 50 -> avg time: 0.317 ns/byte, 16.000 ns/call
average length 100 -> avg time: 0.169 ns/byte, 17.000 ns/call
average length 500 -> avg time: 0.074 ns/byte, 37.000 ns/call
average length 1000 -> avg time: 0.068 ns/byte, 68.000 ns/call
average length 5000 -> avg time: 0.064 ns/byte, 318.000 ns/call
average length 10000 -> avg time: 0.062 ns/byte, 622.000 ns/call
average length 1000000 -> avg time: 0.062 ns/byte, 62000.000 ns/call
chqrlie> gcc -std=c99 -O1 benchstrlen.c && ./a.out
average length 0 -> avg time: 20.000 ns/byte, 20.000 ns/call
average length 4 -> avg time: 3.818 ns/byte, 21.000 ns/call
average length 10 -> avg time: 2.190 ns/byte, 23.000 ns/call
average length 50 -> avg time: 0.990 ns/byte, 50.000 ns/call
average length 100 -> avg time: 0.816 ns/byte, 82.000 ns/call
average length 500 -> avg time: 0.679 ns/byte, 340.000 ns/call
average length 1000 -> avg time: 0.664 ns/byte, 664.000 ns/call
average length 5000 -> avg time: 0.651 ns/byte, 3254.000 ns/call
average length 10000 -> avg time: 0.649 ns/byte, 6491.000 ns/call
average length 1000000 -> avg time: 0.648 ns/byte, 648000.000 ns/call
chqrlie> gcc -std=c99 -O2 benchstrlen.c && ./a.out
average length 0 -> avg time: 10.000 ns/byte, 10.000 ns/call
average length 4 -> avg time: 2.000 ns/byte, 11.000 ns/call
average length 10 -> avg time: 1.048 ns/byte, 11.000 ns/call
average length 50 -> avg time: 0.337 ns/byte, 17.000 ns/call
average length 100 -> avg time: 0.299 ns/byte, 30.000 ns/call
average length 500 -> avg time: 0.202 ns/byte, 101.000 ns/call
average length 1000 -> avg time: 0.188 ns/byte, 188.000 ns/call
average length 5000 -> avg time: 0.174 ns/byte, 868.000 ns/call
average length 10000 -> avg time: 0.172 ns/byte, 1716.000 ns/call
average length 1000000 -> avg time: 0.172 ns/byte, 172000.000 ns/call






share|improve this answer














share|improve this answer



share|improve this answer








edited 7 hours ago

























answered 9 hours ago









chqrliechqrlie

63.1k849108




63.1k849108












  • Wouldn't it still be better for the inlined version to use the same optimizations as the library strlen, giving the best of both worlds?

    – Daniel H
    8 hours ago






  • 1





    It would, but the hand optimized version in the C library might be larger and more complicated to inline. I have not looked into this recently, but there used to be a mix of complex platform specific macros in <string.h> and hard coded optimisations in the gcc code generator. Definitely still room for improvement on intel targets.

    – chqrlie
    8 hours ago











  • Does it change if you use -march=native -mtune=native?

    – Deduplicator
    7 hours ago











  • Note that the GNU C library function for strlen() is likely optimised for extremely large strings (that no sane programmer will care about) at the expense of performance for small strings (that are extremely common); and the optimisations done by the library version should never be done. The problem here is that the OP's code doesn't keep track of the string's length itself (e.g. with an int len; variable) and should not have used strlen() at all, making the code so bad for performance that "optimised for something no sane programmer would care about" actually helped.

    – Brendan
    7 hours ago







  • 1





    @chqrlie: I'd also say that this is partly a symptom of a larger problem - code in libraries can't be optimised for any specific case and therefore must be "un-optimal" for some cases. To work around that it would've been nice if there was a strlen_small() and a separate strlen_large(), but there isn't.

    – Brendan
    6 hours ago

















  • Wouldn't it still be better for the inlined version to use the same optimizations as the library strlen, giving the best of both worlds?

    – Daniel H
    8 hours ago






  • 1





    It would, but the hand optimized version in the C library might be larger and more complicated to inline. I have not looked into this recently, but there used to be a mix of complex platform specific macros in <string.h> and hard coded optimisations in the gcc code generator. Definitely still room for improvement on intel targets.

    – chqrlie
    8 hours ago











  • Does it change if you use -march=native -mtune=native?

    – Deduplicator
    7 hours ago











  • Note that the GNU C library function for strlen() is likely optimised for extremely large strings (that no sane programmer will care about) at the expense of performance for small strings (that are extremely common); and the optimisations done by the library version should never be done. The problem here is that the OP's code doesn't keep track of the string's length itself (e.g. with an int len; variable) and should not have used strlen() at all, making the code so bad for performance that "optimised for something no sane programmer would care about" actually helped.

    – Brendan
    7 hours ago







  • 1





    @chqrlie: I'd also say that this is partly a symptom of a larger problem - code in libraries can't be optimised for any specific case and therefore must be "un-optimal" for some cases. To work around that it would've been nice if there was a strlen_small() and a separate strlen_large(), but there isn't.

    – Brendan
    6 hours ago
















Wouldn't it still be better for the inlined version to use the same optimizations as the library strlen, giving the best of both worlds?

– Daniel H
8 hours ago





Wouldn't it still be better for the inlined version to use the same optimizations as the library strlen, giving the best of both worlds?

– Daniel H
8 hours ago




1




1





It would, but the hand optimized version in the C library might be larger and more complicated to inline. I have not looked into this recently, but there used to be a mix of complex platform specific macros in <string.h> and hard coded optimisations in the gcc code generator. Definitely still room for improvement on intel targets.

– chqrlie
8 hours ago





It would, but the hand optimized version in the C library might be larger and more complicated to inline. I have not looked into this recently, but there used to be a mix of complex platform specific macros in <string.h> and hard coded optimisations in the gcc code generator. Definitely still room for improvement on intel targets.

– chqrlie
8 hours ago













Does it change if you use -march=native -mtune=native?

– Deduplicator
7 hours ago





Does it change if you use -march=native -mtune=native?

– Deduplicator
7 hours ago













Note that the GNU C library function for strlen() is likely optimised for extremely large strings (that no sane programmer will care about) at the expense of performance for small strings (that are extremely common); and the optimisations done by the library version should never be done. The problem here is that the OP's code doesn't keep track of the string's length itself (e.g. with an int len; variable) and should not have used strlen() at all, making the code so bad for performance that "optimised for something no sane programmer would care about" actually helped.

– Brendan
7 hours ago






Note that the GNU C library function for strlen() is likely optimised for extremely large strings (that no sane programmer will care about) at the expense of performance for small strings (that are extremely common); and the optimisations done by the library version should never be done. The problem here is that the OP's code doesn't keep track of the string's length itself (e.g. with an int len; variable) and should not have used strlen() at all, making the code so bad for performance that "optimised for something no sane programmer would care about" actually helped.

– Brendan
7 hours ago





1




1





@chqrlie: I'd also say that this is partly a symptom of a larger problem - code in libraries can't be optimised for any specific case and therefore must be "un-optimal" for some cases. To work around that it would've been nice if there was a strlen_small() and a separate strlen_large(), but there isn't.

– Brendan
6 hours ago





@chqrlie: I'd also say that this is partly a symptom of a larger problem - code in libraries can't be optimised for any specific case and therefore must be "un-optimal" for some cases. To work around that it would've been nice if there was a strlen_small() and a separate strlen_large(), but there isn't.

– Brendan
6 hours ago



















draft saved

draft discarded
















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid


  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55563598%2fwhy-is-this-code-6-5x-slower-with-optimizations-enabled%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Францішак Багушэвіч Змест Сям'я | Біяграфія | Творчасць | Мова Багушэвіча | Ацэнкі дзейнасці | Цікавыя факты | Спадчына | Выбраная бібліяграфія | Ушанаванне памяці | У філатэліі | Зноскі | Літаратура | Спасылкі | НавігацыяЛяхоўскі У. Рупіўся дзеля Бога і людзей: Жыццёвы шлях Лявона Вітан-Дубейкаўскага // Вольскі і Памідораў з песняй пра немца Адвакат, паэт, народны заступнік Ашмянскі веснікВ Минске появится площадь Богушевича и улица Сырокомли, Белорусская деловая газета, 19 июля 2001 г.Айцец беларускай нацыянальнай ідэі паўстаў у бронзе Сяргей Аляксандравіч Адашкевіч (1918, Мінск). 80-я гады. Бюст «Францішак Багушэвіч».Яўген Мікалаевіч Ціхановіч. «Партрэт Францішка Багушэвіча»Мікола Мікалаевіч Купава. «Партрэт зачынальніка новай беларускай літаратуры Францішка Багушэвіча»Уладзімір Іванавіч Мелехаў. На помніку «Змагарам за родную мову» Барэльеф «Францішак Багушэвіч»Памяць пра Багушэвіча на Віленшчыне Страчаная сталіца. Беларускія шыльды на вуліцах Вільні«Krynica». Ideologia i przywódcy białoruskiego katolicyzmuФранцішак БагушэвічТворы на knihi.comТворы Францішка Багушэвіча на bellib.byСодаль Уладзімір. Францішак Багушэвіч на Лідчыне;Луцкевіч Антон. Жыцьцё і творчасьць Фр. Багушэвіча ў успамінах ягоных сучасьнікаў // Запісы Беларускага Навуковага таварыства. Вільня, 1938. Сшытак 1. С. 16-34.Большая российская1188761710000 0000 5537 633Xn9209310021619551927869394п

На ростанях Змест Гісторыя напісання | Месца дзеяння | Час дзеяння | Назва | Праблематыка трылогіі | Аўтабіяграфічнасць | Трылогія ў тэатры і кіно | Пераклады | У культуры | Зноскі Літаратура | Спасылкі | НавігацыяДагледжаная версіяправерана1 зменаДагледжаная версіяправерана1 зменаАкадэмік МІЦКЕВІЧ Канстанцін Міхайлавіч (Якуб Колас) Прадмова М. І. Мушынскага, доктара філалагічных навук, члена-карэспандэнта Нацыянальнай акадэміі навук Рэспублікі Беларусь, прафесараНашаніўцы ў трылогіі Якуба Коласа «На ростанях»: вобразы і прататыпы125 лет Янке МавруКнижно-документальная выставка к 125-летию со дня рождения Якуба Коласа (1882—1956)Колас Якуб. Новая зямля (паэма), На ростанях (трылогія). Сулкоўскі Уладзімір. Радзіма Якуба Коласа (серыял жывапісных палотнаў)Вокладка кнігіІлюстрацыя М. С. БасалыгіНа ростаняхАўдыёверсія трылогііВ. Жолтак У Люсiнскай школе 1959

Беларусь Змест Назва Гісторыя Геаграфія Сімволіка Дзяржаўны лад Палітычныя партыі Міжнароднае становішча і знешняя палітыка Адміністрацыйны падзел Насельніцтва Эканоміка Культура і грамадства Сацыяльная сфера Узброеныя сілы Заўвагі Літаратура Спасылкі НавігацыяHGЯOiТоп-2011 г. (па версіі ej.by)Топ-2013 г. (па версіі ej.by)Топ-2016 г. (па версіі ej.by)Топ-2017 г. (па версіі ej.by)Нацыянальны статыстычны камітэт Рэспублікі БеларусьШчыльнасць насельніцтва па краінахhttp://naviny.by/rubrics/society/2011/09/16/ic_articles_116_175144/А. Калечыц, У. Ксяндзоў. Спробы засялення краю неандэртальскім чалавекам.І ў Менску былі мамантыА. Калечыц, У. Ксяндзоў. Старажытны каменны век (палеаліт). Першапачатковае засяленне тэрыторыіГ. Штыхаў. Балты і славяне ў VI—VIII стст.М. Клімаў. Полацкае княства ў IX—XI стст.Г. Штыхаў, В. Ляўко. Палітычная гісторыя Полацкай зямліГ. Штыхаў. Дзяржаўны лад у землях-княствахГ. Штыхаў. Дзяржаўны лад у землях-княствахБеларускія землі ў складзе Вялікага Княства ЛітоўскагаЛюблінская унія 1569 г."The Early Stages of Independence"Zapomniane prawdy25 гадоў таму было аб'яўлена, што Язэп Пілсудскі — беларус (фота)Наша вадаДакументы ЧАЭС: Забруджванне тэрыторыі Беларусі « ЧАЭС Зона адчужэнняСведения о политических партиях, зарегистрированных в Республике Беларусь // Министерство юстиции Республики БеларусьСтатыстычны бюлетэнь „Полаўзроставая структура насельніцтва Рэспублікі Беларусь на 1 студзеня 2012 года і сярэднегадовая колькасць насельніцтва за 2011 год“Индекс человеческого развития Беларуси — не было бы нижеБеларусь занимает первое место в СНГ по индексу развития с учетом гендерного факцёраНацыянальны статыстычны камітэт Рэспублікі БеларусьКанстытуцыя РБ. Артыкул 17Трансфармацыйныя задачы БеларусіВыйсце з крызісу — далейшае рэфармаванне Беларускі рубель — сусветны лідар па дэвальвацыяхПра змену коштаў у кастрычніку 2011 г.Бядней за беларусаў у СНД толькі таджыкіСярэдні заробак у верасні дасягнуў 2,26 мільёна рублёўЭканомікаГаласуем за ТОП-100 беларускай прозыСучасныя беларускія мастакіАрхитектура Беларуси BELARUS.BYА. Каханоўскі. Культура Беларусі ўсярэдзіне XVII—XVIII ст.Анталогія беларускай народнай песні, гуказапісы спеваўБеларускія Музычныя IнструментыБеларускі рок, які мы страцілі. Топ-10 гуртоў«Мясцовы час» — нязгаслая легенда беларускай рок-музыкіСЯРГЕЙ БУДКІН. МЫ НЯ ЗНАЕМ СВАЁЙ МУЗЫКІМ. А. Каладзінскі. НАРОДНЫ ТЭАТРМагнацкія культурныя цэнтрыПублічная дыскусія «Беларуская новая пьеса: без беларускай мовы ці беларуская?»Беларускія драматургі па-ранейшаму лепш ставяцца за мяжой, чым на радзіме«Працэс незалежнага кіно пайшоў, і дзяржаву турбуе яго непадкантрольнасць»Беларускія філосафы ў пошуках прасторыВсе идём в библиотекуАрхіваванаАб Нацыянальнай праграме даследавання і выкарыстання касмічнай прасторы ў мірных мэтах на 2008—2012 гадыУ космас — разам.У суседнім з Барысаўскім раёне пабудуюць Камандна-вымяральны пунктСвяты і абрады беларусаў«Мірныя бульбашы з малой краіны» — 5 непраўдзівых стэрэатыпаў пра БеларусьМ. Раманюк. Беларускае народнае адзеннеУ Беларусі скарачаецца колькасць злачынстваўЛукашэнка незадаволены мінскімі ўладамі Крадзяжы складаюць у Мінску каля 70% злачынстваў Узровень злачыннасці ў Мінскай вобласці — адзін з самых высокіх у краіне Генпракуратура аналізуе стан са злачыннасцю ў Беларусі па каэфіцыенце злачыннасці У Беларусі стабілізавалася крымінагеннае становішча, лічыць генпракурорЗамежнікі сталі здзяйсняць у Беларусі больш злачынстваўМУС Беларусі турбуе рост рэцыдыўнай злачыннасціЯ з ЖЭСа. Дазволіце вас абкрасці! Рэйтынг усіх службаў і падраздзяленняў ГУУС Мінгарвыканкама вырасАб КДБ РБГісторыя Аператыўна-аналітычнага цэнтра РБГісторыя ДКФРТаможняagentura.ruБеларусьBelarus.by — Афіцыйны сайт Рэспублікі БеларусьСайт урада БеларусіRadzima.org — Збор архітэктурных помнікаў, гісторыя Беларусі«Глобус Беларуси»Гербы и флаги БеларусиАсаблівасці каменнага веку на БеларусіА. Калечыц, У. Ксяндзоў. Старажытны каменны век (палеаліт). Першапачатковае засяленне тэрыторыіУ. Ксяндзоў. Сярэдні каменны век (мезаліт). Засяленне краю плямёнамі паляўнічых, рыбакоў і збіральнікаўА. Калечыц, М. Чарняўскі. Плямёны на тэрыторыі Беларусі ў новым каменным веку (неаліце)А. Калечыц, У. Ксяндзоў, М. Чарняўскі. Гаспадарчыя заняткі ў каменным векуЭ. Зайкоўскі. Духоўная культура ў каменным векуАсаблівасці бронзавага веку на БеларусіФарміраванне супольнасцей ранняга перыяду бронзавага векуФотографии БеларусиРоля беларускіх зямель ва ўтварэнні і ўмацаванні ВКЛВ. Фадзеева. З гісторыі развіцця беларускай народнай вышыўкіDMOZGran catalanaБольшая российскаяBritannica (анлайн)Швейцарскі гістарычны15325917611952699xDA123282154079143-90000 0001 2171 2080n9112870100577502ge128882171858027501086026362074122714179пппппп