math - Softmax Implementation in C++ - Stack Overflow

We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.

Closed 4 years ago.

Folks,

Is there an example implementation of a simple softmax function for N values? I've seen things like "softmax-based detectors" and so forth, but I just want to see a pure, straightforward C++ softmax implementation.

Any examples you know of?

Thanks,

or google... codereview.stackexchange.com/questions/177973/… stackoverflow.com/questions/9906136/… – OznOg Oct 19, 2018 at 17:19

Sure it can be implemented in a number of ways. The implementation will depend heavily on how you're representing your data, which could be vector<T>, array<T,N>, some pointer array, or even some library-specific thing like TensorFlow. It would help you get a good answer if you showed how you're representing your problem, what you've already tried, and where exactly you got stuck. – alter_igel Oct 19, 2018 at 18:36

@JesperJuhl It was not a comment for you, rather for the asker who has responses to his question in SO already – OznOg Oct 19, 2018 at 18:47

I haven't seen a library implementation of softmax, although that's not proof that it doesn't exist. It's simple enough that people just write their own when they need it.

For the record, the softmax function on u1, u2, u3, ... is just the tuple (exp(u1)/Z, exp(u2)/Z, exp(u3)/Z, ...) where the normalizing constant Z is just the sum of the exponentials, Z = exp(u1) + exp(u2) + exp(u3) + ...
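As a rough sketch, that definition can be transcribed almost literally. The function name softmax_naive and the choice of std::vector<double> as the container are my own assumptions, not anything the question specifies, and a non-empty input is assumed.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Direct transcription of the definition: out[i] = exp(u[i]) / Z,
// where Z is the sum of all the exponentials.
// Caveat: std::exp overflows to infinity for large inputs, so this
// version is only suitable for moderately sized values.
// (softmax_naive is a hypothetical name chosen for illustration.)
std::vector<double> softmax_naive(const std::vector<double>& u) {
    assert(!u.empty());  // assumption: at least one input value

    std::vector<double> out(u.size());
    double Z = 0.0;
    for (std::size_t i = 0; i < u.size(); ++i) {
        out[i] = std::exp(u[i]);  // numerator exp(u_i)
        Z += out[i];              // accumulate the normalizing constant
    }
    for (double& p : out) {
        p /= Z;                   // normalize so the outputs sum to 1
    }
    return out;
}
```

Calling softmax_naive({1.0, 2.0, 3.0}) returns three positive values that sum to 1, with the largest weight on the last element.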

Note that adding or subtracting a constant from each u leaves the result unchanged, since it's equivalent to multiplying above and below by the same factor. So you could make the calculation a little more numerically well-behaved by subtracting the greatest value among the u's; then the largest term exp(u) will be 1 and all the others something smaller than that.
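Here is a minimal sketch of that max-subtraction trick. The name softmax_stable and the std::vector<double> interface are assumptions made for illustration, and a non-empty input is assumed; adapt it to whatever container you actually use.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Softmax with the largest input subtracted from every element first.
// The shift does not change the result, but it guarantees that every
// exponent is <= 0, so each exp() term lies in (0, 1] and cannot overflow.
// (softmax_stable is a hypothetical name chosen for illustration.)
std::vector<double> softmax_stable(const std::vector<double>& u) {
    assert(!u.empty());  // assumption: max_element needs at least one value

    const double u_max = *std::max_element(u.begin(), u.end());

    std::vector<double> out(u.size());
    double Z = 0.0;
    for (std::size_t i = 0; i < u.size(); ++i) {
        out[i] = std::exp(u[i] - u_max);  // largest term is exp(0) == 1
        Z += out[i];                      // so Z is at least 1
    }
    for (double& p : out) {
        p /= Z;                           // normalize so the outputs sum to 1
    }
    return out;
}
```

With inputs such as {1000.0, 1000.0}, evaluating exp(1000.0) directly overflows a double to infinity and the division produces NaN, whereas the shifted version returns 0.5 for each element.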