Hi Bala.

I'm not sure how this usually goes but here is my current wish list.

We'd have to discuss whether any of that actually fits into the scikits,

though ;)

- Multilayer Perceptron and Multinomial Logistic regression

I have been working on that so maybe there is not enough

left to do there for a GSoC. Not sure, though.

- Graph Cut Energy minimization

This is an inference technique so I'm not totally sure

if this should go into scikit-learn. Could also be a

candidate for scikit-image.

The main work would be to implement an efficient

max flow algorithm and then do graph constructions

for alpha expansion and alpha-beta swaps.
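
To make the "max flow plus graph construction" part concrete, here is a rough sketch of a single binary cut on a chain of nodes. It is not scikit-learn code: the function names `max_flow_min_cut` and `binary_labeling` are made up for illustration, and it uses plain Edmonds-Karp, which is much slower than the specialized max flow algorithms a real implementation would need. Alpha expansion and alpha-beta swaps would repeatedly solve binary problems of this kind.

```python
from collections import deque


def max_flow_min_cut(capacity, source, sink):
    """Edmonds-Karp on a dense capacity matrix (list of lists).

    Returns (flow value, set of nodes on the source side of the min cut).
    """
    n = len(capacity)
    residual = [row[:] for row in capacity]
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            # No augmenting path left: reachable nodes form the source side.
            return flow, {v for v in range(n) if parent[v] != -1}
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while v != source:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push flow along the path and update reverse edges.
        v = sink
        while v != source:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


def binary_labeling(unary0, unary1, pairwise):
    """Minimize unary costs plus `pairwise` per label change on a chain.

    unary0[i] / unary1[i] is the cost of giving node i label 0 / 1.
    """
    n = len(unary0)
    s, t = n, n + 1
    cap = [[0] * (n + 2) for _ in range(n + 2)]
    for i in range(n):
        cap[s][i] = unary1[i]  # cut when node i ends up on the sink side (label 1)
        cap[i][t] = unary0[i]  # cut when node i stays on the source side (label 0)
    for i in range(n - 1):
        cap[i][i + 1] = cap[i + 1][i] = pairwise
    energy, source_side = max_flow_min_cut(cap, s, t)
    labels = [0 if i in source_side else 1 for i in range(n)]
    return energy, labels
```

The value of the max flow equals the minimum energy, and the labeling is read off from which side of the cut each node ends up on.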

- Averaged gradient descent

I think this is on everybody's wish list. Not

sure how much work this will be.

I'm sure lots of people will have something to say about that ;)

See issue #543: https://github.com/scikit-learn/scikit-learn/issues/543
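
As a rough illustration, assuming "averaged" means keeping a running mean of the SGD iterates (Polyak-Ruppert style averaging), a minimal NumPy sketch could look like this. The function name `averaged_sgd`, the squared loss, and the constant step size are choices made for the example only, not what the issue proposes.

```python
import numpy as np


def averaged_sgd(X, y, n_epochs=20, eta=0.01, seed=0):
    """Plain SGD on squared loss; also returns the running mean of the iterates."""
    rng = np.random.RandomState(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)      # current SGD iterate
    w_avg = np.zeros(n_features)  # mean of all iterates seen so far
    t = 0
    for _ in range(n_epochs):
        for i in rng.permutation(n_samples):
            # Gradient of 0.5 * (x_i . w - y_i)^2 with respect to w.
            grad = (X[i] @ w - y[i]) * X[i]
            w -= eta * grad
            t += 1
            # Incremental update of the mean, no need to store past iterates.
            w_avg += (w - w_avg) / t
    return w, w_avg
```

The averaged vector `w_avg` is the one that would be used for prediction; the extra cost per step is one vector update.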

- Structured SVM / CRF learning

This is a big one. Not sure what other people think of it.

I think having a structured SVM would be great.

At the moment, the most commonly used implementation

is Joachims' SVMstruct.

This has licensing issues but talking to him might help.

Another option is implementing optimization via SGD

or, if you want to go crazy, cutting plane techniques

or bundle methods yourself.

Designing the interface is also non-trivial.

One would have to think about whether / how it

is possible to use structured SVMs just from Python,

without writing Cython functions.

- Low rank kernel approximations (Nystrom methods)

This is mainly interesting for SVMs.

The idea is to approximate the kernel matrix with

a low rank factorization and use this to construct

a linear SVM problem.

This is related to the current kernel approximation

module but takes a somewhat different approach.

This method makes large-scale SVMs fast / possible.
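
A minimal sketch of the idea, assuming an RBF kernel and uniformly sampled landmark points (both are choices made for the example; `rbf_kernel` and `nystroem_features` are illustrative names, not existing scikit-learn functions): build a feature map Z such that Z Z^T approximates the kernel matrix, then train a linear SVM on Z.

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.5):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0))


def nystroem_features(X, n_landmarks=50, gamma=0.5, seed=0):
    """Return Z of shape (n_samples, n_landmarks) with Z @ Z.T ~= K(X, X)."""
    rng = np.random.RandomState(seed)
    idx = rng.choice(X.shape[0], n_landmarks, replace=False)
    landmarks = X[idx]
    K_mm = rbf_kernel(landmarks, landmarks, gamma)
    # Inverse square root of K_mm via eigendecomposition,
    # clipping tiny eigenvalues for numerical stability.
    vals, vecs = np.linalg.eigh(K_mm)
    vals = np.maximum(vals, 1e-12)
    K_mm_inv_sqrt = (vecs / np.sqrt(vals)) @ vecs.T
    K_nm = rbf_kernel(X, landmarks, gamma)
    return K_nm @ K_mm_inv_sqrt
```

With m landmarks, the full n x n kernel matrix is never formed; only an n x m and an m x m block are computed.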

- Kernel Perceptron

There is a (I think) pure Python implementation

by Mathieu that could be Cythonized.
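
For reference, the algorithm itself is short. The following is a generic NumPy sketch of a kernel perceptron with an RBF kernel, not Mathieu's implementation; the function names and the choice of kernel are assumptions made for the example.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0))


def kernel_perceptron_fit(X, y, gamma=1.0, n_epochs=10):
    """Labels y must be in {-1, +1}. Returns per-sample mistake counts alpha."""
    K = rbf_kernel(X, X, gamma)
    alpha = np.zeros(X.shape[0])
    for _ in range(n_epochs):
        for i in range(X.shape[0]):
            # Decision value for sample i under the current dual weights.
            score = np.dot(alpha * y, K[:, i])
            if y[i] * score <= 0:
                alpha[i] += 1  # mistake: add this sample to the expansion
    return alpha


def kernel_perceptron_predict(X_train, y_train, alpha, X_new, gamma=1.0):
    """Predict labels in {-1, +1} for the rows of X_new."""
    scores = rbf_kernel(X_new, X_train, gamma) @ (alpha * y_train)
    return np.where(scores >= 0, 1, -1)
```

The inner loop over samples is the part that would benefit from Cython, since it is a tight Python loop over kernel columns.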

That's it for the moment, I think.

I'd be happy to mentor any of the above projects

if the others agree that they are sensible.

Maybe we should update the wiki for the next GSoC?

Cheers,

Andy

*Post by Bala Subrahmanyam Varanasi*

Dear all,

I would like to participate in Google Summer of Code this year. Please

let me know the ideas which you would like to implement in

scikit-learn in GSoC - 2012.

Also, I'm taking Stanford's online courses (the ML class and the NLP

class). I believe this is the right time to discuss ideas, because I can

learn new things before the start of GSoC and then work on challenging

implementations in scikit-learn.

Thank you.

Bala Subrahmanyam Varanasi

IV B.Tech, Information Technology

Vishnu Institute of Technology

contact number: +919985415959