For example, how can I generate an image that corresponds to the caption "a person skateboarding in the street with some people looking on"?