Galaxy Zoo Talk

Discussion of: Galaxy Zoo and undergraduate research: spiral arms, colors, and brightnesses

  • JeanTate by JeanTate

    Galaxy Zoo and undergraduate research: spiral arms, colors, and brightnesses is the title of a recent (October 29, 2013) Galaxy Zoo (GZ) blog post. Author, and GZ Science Team member, Kyle Willett introduced it as follows:

    The guest post below is by Zach Pace, an undergraduate physics student at the University of Buffalo. Zach worked at the University of Minnesota during the summer of 2013 through the NSF’s Research Experience for Undergraduates (REU) program. Zach is continuing to work with Galaxy Zoo data as part of his senior thesis.

    In this GZ Talk thread, I'd like to discuss this blog post, and so extend the existing discussion in the Comments to a forum better suited to such discussion, as Kyle himself noted (see the last comment, which I've reproduced here).

    First, I'll copy the five comments already posted, each in a separate post.

    Posted

  • JeanTate by JeanTate

    The first comment is one of mine, stardate November 2, 2013 at 9:39 pm:

    Welcome Zach, and thanks for a great blog post!

    Would you mind spelling out in some more detail some of the things in your blog post please?

    For example, “the difference between the blue magnitude and a red band” – I guess you mean the SDSS u’ and r’ band magnitudes, corrected for galactic (MW) extinction, right? If not, what?

    Also, “ it has been suggested that spiral galaxies with more arms and spiral galaxies with tighter arm winding (which is to say, a shallower pitch angle) tend to be brighter and bluer” – with the caveat “controlled for inclination” (or similar), right? I mean, comparing a highly inclined spiral to a face-on, without “correcting” for inclination, could yield spurious results, couldn’t it? If so, how did you control for inclination?

    In “the population of bulgeless edge-on galaxies has a similar shape to the population of face-on galaxies“, what do you mean by “similar shape”? Shape of what?

    In the bottom diagram (actually a pair of figures), what is the black dotted line? In all the diagrams, what do the colors of regions of the plots mean? There’s a (horizontal) legend, which has numbers on it, but no units; there’s also a vertical legend which seems to have nothing to do with the plots at all.

    Are the ‘r magnitude’s corrected for galactic extinction? k-corrected? What cosmological model did you use, with what values?

    Posted

  • JeanTate by JeanTate

    The next is also one of mine, stardate, in which I express some frustration:

    I guess my comment earned this week’s lead balloon award! 😦

    Zach, if you say this, but have no intention of doing what you say, wouldn’t it be better simply to remain silent?

    If you have any questions, feel free to comment below.

    OTOH, if you’re simply busy, but will certainly be responding when you have a chance, might I suggest a quick “Glad to read your comment Jean! I’m busy right now, but I’ll definitely try to answer your questions when I have time.“?

    Posted

  • JeanTate by JeanTate

    Early the next day - or perhaps later the same day (November 9, 2013 at 1:05 am), Kyle replied:

    Dear Jean,

    I’ll answer what questions I can. Zach is a full-time student at another institution and doesn’t get updates when comments are posted, so I doubt that he has seen your questions yet.

    First: yes, this can specifically be extinction-corrected (u’ – r’). Color-morphology relations are seen in many magnitude pairings, though. It’s more of a general statement.

    Zach’s work has controlled for edge-on vs. face-on disks using the GZ2 data, which we’ve calibrated against the axis ratios in SDSS. His work on spirals is for face-on disks only.

    In the CMD space, the distributions of both colors and magnitudes for bulgeless, edge-on galaxies and that of bulgeless, face-on galaxies are similar. A lot of our work has been finding better ways of quantifying how similar the distributions are, building on things like just comparing the mean and variance.

    The black dotted lines are simple fits along the most populated bins in each magnitude bin; we also worked a lot on doing more formal fits to more formal bivariate distributions. We’re not claiming yet that these lines show anything really of use, but they help draw the eye to the outline of the galaxies. Colors show the number of galaxies in each bin, weighted by the probability of the morphological feature. The horizontal and vertical legends are the same.

    The magnitudes are extinction and k-corrected, with WMAP7 values. Choice of cosmology doesn’t make a very big difference in the SDSS main galaxy sample, though.

    This is work in progress, so please take it with a grain of salt. Zach did an excellent job working on the Zooniverse, but none of this has been published yet. We’re excited that more analysis will hopefully lead to papers and formal results, but much of this may change in the future.

    Posted

  • JeanTate by JeanTate

    To which I replied a couple of days later (November 11, 2013 at 5:02 pm):

    Thank you very much Kyle!

    Zach seems keen to have a discussion; is there a way I can notify him (send him an email perhaps)?

    Might it be better to have the discussion in GZ Talk? or the GZ forum? The GZ blog’s comments are extremely limited in their capabilities (unless you know what HTML tags work, and what don’t).

    Do you mind if I ask, what tool/app did you use to produce the colorized plots? The color scheme reminds me of elevations in atlases (all those ‘white peaks’ in WA!) …

    Posted

  • JeanTate by JeanTate

    And here's Kyle's response, the last comment (as of now), November 12, 2013 at 10:17 pm:

    Sure – I’ll forward your comments, and let Zach get in touch with you himself. I believe he’s at a conference this week, so I don’t know if he’ll be able to respond immediately.

    Spreading this on Talk (or the forum) would be excellent – it can be hard to have real discussion in blog comments, and it’s exactly what we designed Talk for. If it hasn’t been done yet, I think it’d be ideal to cross-post the main content of the post, plus your questions – Zach, I, and the rest of the GZ community can weigh in more there.

    Posted

  • JeanTate by JeanTate

    So, welcome Zach, to GZ Talk! 😃

    A repeat of an open question, to get the ball rolling: what tool/app did you use to produce the colorized plots? The color scheme reminds me of elevations in atlases (all those ‘white peaks’ in WA!) …

    And please, my fellow zooites, join in the discussion!

    Posted

  • KWillett by KWillett scientist, admin, translator

    Hi Jean,

    Hopefully Zach will also join us soon. Tackling the first question: these plots are made using matplotlib, which is a Python plotting library. Python is rapidly becoming very popular in astronomy as the language of choice over IDL; it's free, and people (including me occasionally) are building up lots of pre-made routines and libraries for astro-specific tasks. Zach's images are 2-D histograms, where the color of each bin represents the number of galaxies whose x- and y-axis values fall within that bin. The contour plots are smoothed interpolations of how many galaxies fall in each region.

    Here's a link showing some code of how these are made in Python, if anyone is interested. http://micropore.wordpress.com/2011/10/01/2d-density-plot-or-2d-histogram/
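
    For anyone who would rather not follow the link, here is a minimal sketch of just the binning step, using NumPy. The magnitudes, colors, bin counts and ranges below are made up for illustration; they are not real SDSS values and not the settings used for the blog post's plots.

```python
import numpy as np

# Made-up (magnitude, color) pairs standing in for a galaxy sample.
# These are NOT real SDSS measurements.
r_mag = np.array([-21.5, -21.2, -20.8, -20.7, -19.9, -19.6])
u_minus_r = np.array([2.4, 2.6, 1.8, 1.7, 1.5, 2.5])

# Count galaxies in a 3 x 2 grid of (magnitude, color) bins.
# The bin numbers and ranges are arbitrary choices for this sketch.
counts, mag_edges, color_edges = np.histogram2d(
    r_mag,
    u_minus_r,
    bins=[3, 2],
    range=[[-22.0, -19.0], [1.0, 3.0]],
)

# counts[i, j] is the number of galaxies in magnitude bin i and color bin j.
print(counts)
```

    The resulting `counts` array is what gets colored in the plot. In matplotlib the same binning and the drawing can be done in one call with `hist2d`.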

    Posted

  • zpace21 by zpace21

    Thanks, Kyle and Jean!

    As the linked article says, colorized histograms are handy for visualizing dense data, because dense plots of points aren't especially useful. In this case, the count in each 2-dimensional bin is not purely the count of galaxies. It is the sum of the vote fractions.

    So, if we're histogramming 2-armed spiral galaxies, and in a given bin, we have 5 galaxies with vote fractions (percentage of people who voted that the galaxy had two arms) 0.58, 0.79, 0.88, 0.66, and 0.71, then the value for that bin will be 3.62. This is a bit of a simplification, since I pre-applied some cleaning routines about which galaxies are used, but I think you'll get the idea.

    It's worth mentioning that there are a bunch of color mappings available (so I could specify a greyscale color scheme, or a red-to-blue, or really anything else I wanted).
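
    To make the vote-fraction sum concrete, here is a rough sketch in plain Python. It is not the code behind the plots: the function name `weighted_bins`, the bin widths, and the magnitudes and colors are all made up for illustration. Only the five vote fractions come from the example above.

```python
import math
from collections import defaultdict


def weighted_bins(galaxies, mag_width=1.0, color_width=0.5):
    """Sum vote fractions in each (magnitude, color) bin.

    `galaxies` is an iterable of (magnitude, color, vote_fraction) tuples.
    The bin widths are arbitrary choices for this sketch.
    """
    totals = defaultdict(float)
    for mag, color, vote_fraction in galaxies:
        key = (math.floor(mag / mag_width), math.floor(color / color_width))
        totals[key] += vote_fraction
    return dict(totals)


# Five made-up galaxies that land in the same bin, carrying the vote
# fractions from the example, plus one galaxy in a different bin.
sample = [
    (-20.1, 1.60, 0.58),
    (-20.3, 1.70, 0.79),
    (-20.5, 1.80, 0.88),
    (-20.7, 1.90, 0.66),
    (-20.9, 1.75, 0.71),
    (-19.5, 2.30, 0.40),
]

totals = weighted_bins(sample)
# Round for display, since floating-point sums are not exact.
print({key: round(value, 2) for key, value in totals.items()})
```

    The shared bin comes out at about 3.62 rather than 5, which is the difference between summing vote fractions and simply counting galaxies. NumPy's `histogram2d` and matplotlib's `hist2d` both accept a `weights` argument that does the same job.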

    Posted

  • JeanTate by JeanTate

    Again, welcome Zach! 😃

    Thanks too for the answers to my questions.

    So, if we're histogramming 2-armed spiral galaxies, and in a given bin, we have 5 galaxies with vote fractions (percentage of people who voted that the galaxy had two arms) 0.58, 0.79, 0.88, 0.66, and 0.71, then the value for that bin will be 3.62. This is a bit of a simplification, since I pre-applied some cleaning routines about which galaxies are used, but I think you'll get the idea.

    Cleaning, ah yes, a topic dear to my (Quench) heart! I would be very, no scratch that, very interested to hear some details (also see below). Presumably the vote fractions are weighted; are they also de-biased?

    Here's 'below': I also copy/pasted questions to a Talk thread, questions I had posted, as comments to another GZ blog post: How to deal with 'blending' and 'shredding'?. I'm guessing this is something you had to address, to ensure a uniformly, consistently clean sample to begin working with.

    Another aspect: I found some evidence that Eos (edge-on spirals) in particular sometimes have their disks truncated by the SDSS photometric pipeline, resulting in inaccurate magnitudes and - especially - colors (I wrote about this in a GZ forum thread, What are the galaxy's ugriz magnitudes?). In light of this, from the blog post:

    This observation is also borne out in edge-on disk galaxies: the population of bulgeless edge-on galaxies has a similar shape to the population of face-on galaxies, albeit with stronger reddening on the bright end.

    I'm guessing that you had to address this 'SDSS pipeline can shred Eos, resulting in erroneous u-band mags' problem; may I ask how you dealt with it?

    Posted

  • zpace21 by zpace21

    I'll attempt to answer some of your questions below, although I admit some hand-waving will occur:

    Cleaning, ah yes, a topic dear to my (Quench) heart! I would be very, no scratch that, very interested to hear some details (also see below).

    To my understanding, the "cleaning process" amounts to a cutoff number of total participants required to make sure results for a given object are statistically valid. So, for example, if Galaxy X is voted by 40 people to be disk-dominated, and by only 10 people to be bulge-dominated, then perhaps it is not so wise to consider the classifications further down the "bulge-dominated" decision tree. So, in order for a given Galaxy Y's classifications for fields like "arm multiplicity" to be considered, all of the higher-up classifications would have to also be statistically valid. I apologize if I'm describing this poorly, but the process is rather strange--it took me about a day to wrap my brain around it. There's a discussion of it in the GZ2 DR paper in Section 3.3. The values I used for cleaning are in Table 3. Kyle will have to answer most additional questions.
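
    As a rough illustration of the idea only (not the actual pipeline code), the check could be sketched like this. Everything specific here is an assumption: the field names, the minimum vote count and the vote-fraction cuts are hypothetical placeholders, and the real values are the ones in Table 3 of the paper.

```python
# Hypothetical cuts for illustration only. The values actually used are
# those in Table 3 of the GZ2 data release paper, not these.
MIN_TASK_VOTES = 20
PARENT_CUTS = [
    ("frac_features_or_disk", 0.5),
    ("frac_not_edge_on", 0.7),
    ("frac_has_spiral", 0.6),
]


def passes_cleaning(galaxy, min_votes=MIN_TASK_VOTES, parent_cuts=PARENT_CUTS):
    """Return True if the arm-number votes for `galaxy` should be used.

    The task itself needs enough votes, and every answer higher up the
    decision tree needs a large enough vote fraction.
    """
    if galaxy["n_arm_number_votes"] < min_votes:
        return False
    return all(galaxy[field] >= cut for field, cut in parent_cuts)


# Made-up galaxies with hypothetical field names.
well_classified = {
    "n_arm_number_votes": 35,
    "frac_features_or_disk": 0.80,
    "frac_not_edge_on": 0.90,
    "frac_has_spiral": 0.75,
}
too_few_votes = dict(well_classified, n_arm_number_votes=8)
weak_parent = dict(well_classified, frac_features_or_disk=0.20)

for name, galaxy in [
    ("well_classified", well_classified),
    ("too_few_votes", too_few_votes),
    ("weak_parent", weak_parent),
]:
    print(name, passes_cleaning(galaxy))
```

    Only the first galaxy survives: the second has too few votes on the arm-number question itself, and the third fails a cut higher up the tree even though the later answers look fine.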

    Presumably the vote fractions are weighted; are they also de-biased?

    Yes to both, although I would (again) page Kyle for exact details. I know that individuals' votes are weighted in-bulk based on how well they agree with everyone else's, and I know this is an iterative process, but not a whole lot more than that. There's also debiasing done, but it's done further up the data pipeline, so I'm afraid I can't offer much help there, either!

    As for blending, shredding, and the Eos problem, I have not intentionally corrected for any of those. They may be corrected further up in the data pipeline (paging Kyle again!). Either way, I will try to do some reading over my semester break, and see if those issues should/can be corrected. Sorry I can't answer those questions right now, though!

    Posted

  • JeanTate by JeanTate in response to zpace21's comment.

    Thanks Zach.

    That helps me understand what you (and the data pipeline 😉) did, but it would seem to have no applicability to 'Quench cleaning' 😦 At least, no such cleaning that any of the ordinary zooites could do (we don't have access to the vote counts).

    Have you considered looking into the 'cigar-shaped' galaxies? To see to what extent you could 'recover' those which are, physically, highly inclined disks (even Eos)? Perhaps the distributions of those which are very likely disk galaxies resemble that of the bulgeless, or barely-noticeable, disks? Cutting by (SDSS) ellipticity (or is it elongation? I don't recall) would - should! - give you a clean sample of disk galaxies (there are no ellipticals more elongated than E7, and even those may all be lenticulars).

    Posted

  • zpace21 by zpace21 in response to JeanTate's comment.

    I think Kyle and I discussed something similar to what you're talking about when I was in Minnesota over the summer...

    It may be interesting to work some statistics on ellipticity: that could be accomplished by simple cuts on "cigar-shaped" vote fractions and "ellipticity/elongation" and coming up with a correlation coefficient or two between colors and magnitudes. If we want to compare the distributions themselves, though, that is trickier. In fact, I'm working on techniques for doing that right now!
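
    A sketch of what the simple version might look like, in plain Python. The field names, the cut values and the galaxies are all hypothetical, chosen only to show the mechanics of a cut followed by a Pearson correlation coefficient between magnitude and color.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical cut values, not taken from any paper or catalogue.
MIN_CIGAR_FRACTION = 0.5   # minimum "cigar-shaped" vote fraction
MAX_AXIS_RATIO = 0.3       # maximum minor/major axis ratio (b/a)

# Made-up galaxies with hypothetical field names.
sample = [
    {"cigar_fraction": 0.9, "axis_ratio": 0.20, "r_mag": -21.0, "u_minus_r": 2.6},
    {"cigar_fraction": 0.7, "axis_ratio": 0.25, "r_mag": -20.0, "u_minus_r": 2.3},
    {"cigar_fraction": 0.6, "axis_ratio": 0.28, "r_mag": -19.0, "u_minus_r": 1.7},
    {"cigar_fraction": 0.2, "axis_ratio": 0.80, "r_mag": -20.5, "u_minus_r": 2.0},
    {"cigar_fraction": 0.8, "axis_ratio": 0.60, "r_mag": -19.5, "u_minus_r": 1.9},
]

# Keep only galaxies passing both cuts.
selected = [
    g for g in sample
    if g["cigar_fraction"] >= MIN_CIGAR_FRACTION
    and g["axis_ratio"] <= MAX_AXIS_RATIO
]

r = pearson_r(
    [g["r_mag"] for g in selected],
    [g["u_minus_r"] for g in selected],
)
print(len(selected), r)
```

    A single coefficient like this only summarizes a linear trend between the two quantities; it says nothing about whether two populations have the same distribution, which is the trickier comparison.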

    Posted

  • JeanTate by JeanTate in response to zpace21's comment.

    Cool! Maybe you could look at the "In between" ones too?

    More generally, may I ask how you are getting a handle on an observation bias that may be crudely stated as "disk galaxies with spiral arms - in SDSS/GZ2 - are merely the most obvious x% (where x may be as low as 10?) of such galaxies"? Or, your distributions ("The distribution of colors and magnitudes for galaxies are statistically similar, no matter what the number of spiral arms") may not be telling you much about what spiral galaxies are really like?

    Posted