Database Validity
sasquatchin
This may have been gone over before, if so I apologize.

I hope some of the more math/database-knowledgeable members who are so inclined can answer some questions.

Take the BFRO's database, for example. Lately, it seems the only requirement to be an 'investigator' is to attend a group camping trip. More and more often there are entries that (at least to me) should NOT be there. I am referring to entries like "I heard something" or "I saw some crossed trees in the woods", or anything not DIRECTLY attributable to bigfoot.
The question is: what percentage of invalid entries is enough to contaminate the database? One entry? 2 percent? 10 percent?

My feeling is that if 10 percent of the entries are invalid, that makes the entire database useless! Any thoughts?

I have my own 'database', built up over about 20 years in the southern Indiana area. I know the standards that were applied to each entry. Those standards may not be acceptable to others, but at least I know what they are!

As far as I know, SRI is the only group that actually applies standards and a review process to their sightings database. It is still new, and therefore has few entries so far. Are we to rely on the BFRO's database simply because it is the largest? What process could be applied to separate the wheat from the chaff and make it useful again?
Elder
I agree that the BFRO database contains a lot of garbage. Just like any other website, it needs something new to keep the visits coming. I only retain information that I feel is worthwhile. I saw a couple of the "investigators" on a local TV program last year, and I agree with you on their credibility.
wufgar
I understand the initial point made, but databases by nature will carry 'junk data'. Still, that is the data you have, and it is the best tool you have to make any sense of things. As John Green points out in Apes Among Us, the government regularly relies on datasets it knows to contain 'junk' to draw conclusions (so Green did the same in his analyses).

Truth of the matter is that it almost always balances out if you have any good, representative data. It's a database: it isn't so much concerned with pennies as it is with hundred dollar bills. Time and again, and I speak from experience as a DBA, you will notice that even if you whittle out the suspected 'junk data' and then perform your analysis, your results do not differ in a significant manner. Usually you just whittle yourself down to no data at all, and it is always more helpful to be inclusive than exclusive. I'm no public defender of the BFRO (their DB isn't even queryable), but the probability of 'junk' in their data isn't enough to say it's all crap.
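
One rough way to put numbers on the "does junk change the result" question is to treat the database as a mix of valid and junk reports. The sketch below is only an illustration: the function name and every number in it are made up, and it assumes junk reports behave like a separate group with its own share of whatever is being measured.

```python
def observed_share(true_share, junk_share, junk_fraction):
    """Share seen in a database where junk_fraction of entries are junk.

    true_share:    share among valid reports (e.g. fraction filed in autumn)
    junk_share:    the same share among junk reports
    junk_fraction: fraction of all entries that are junk, from 0 to 1
    """
    if not 0.0 <= junk_fraction <= 1.0:
        raise ValueError("junk_fraction must be between 0 and 1")
    return (1.0 - junk_fraction) * true_share + junk_fraction * junk_share


# Hypothetical numbers: 40% of valid reports fall in autumn, junk is spread
# evenly over four seasons (25%), and 10% of the database is junk.
shift = observed_share(0.40, 0.25, 0.10) - 0.40
print(f"shift in autumn share: {shift:+.3f}")
```

With these made-up inputs the autumn share moves from 0.40 to about 0.385. The shift grows with both the junk fraction and with how different the junk is from the valid reports, so 10 percent junk matters little when the junk looks like everything else and more when it is lopsided.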
chrisandclauida2
How do you know anything you know about the creature's movement, habitat, activity, interaction with its environment, its build, look, hair color, proliferation throughout the US or the world, or any of a million little facts?

The answer is the databases. Without them, the limit of the collective knowledge would be minute.
Bobby Orangeboom
I understand the point being made here, BUT it's obviously down to the individual/investigator and whether they think it's a credible sighting or not. If they feel it is credible, they then add it to whatever database is relevant to them.

As far as qualifications go for an actual BF investigator, and who should be classed as one or not, that's a whole different subject in itself!! :wink:
mike2k1
This is a pet subject of mine and one I have held an opinion on for a long time. Databases are a double-edged sword. To give you the gist of my thinking, I pulled a recent quote of mine from a thread where there was speculation on population size based on information from public databases:

QUOTE
Great article, Moregon! I would also like to caution against relying on information from public databases. If you are trying to formulate a hypothesis of population size based on information from accessible databases, then you are taking both the good and the bad information into the math. IMO (for a long time), public databases are a double-edged sword because individual reports can be influenced by past reports, thus skewing the database. You also have to take into account bad investigation and follow-up, class B reports, misidentifications, etc. It happens. Look at the BFRO (and I'm not bashing here, they just have the biggest accessible database): great database, sure, but easily accessible.... How many reports have information based on a previous report? The database is accessible and readable by everyone, so you would have to think that somewhere, someone wanting to report a sighting of something looked at the website, read a few reports, and, when they filed their report, used a description or some information from a previous report to make it sound like it fits. You've got to think it has happened. It's human nature to try to make things fit in with the group. So to answer the question of how many reports are based on previous info: you don't know. Look at another issue. How many reports are vocals, shadows, smells, etc.? They are not true sightings, so you don't know for sure whether that was a sasquatch or some other known animal. When you get down to it, a true eyeball sighting is pretty rare (unless you live on a Tennessee farm and they borrow garlic from you every now and again... but that's a whole different subject). So counting population by reported sightings isn't going to cut it, and if you try to weed out the bad info and class B stuff and go with true visual accounts, then the 2-6000 count would be high. I say stick, once again, with finding the one. That one is the most important count.
ray crowe
Have added over 4000 reports published in back issues of the Track Record to my website (still have another six years to add -- text of 160 of them posted). Readers must keep Skepticals on for ALL reports. Thus... I report everything: sort into sighting, track, other, and give all a low rating (a 3 out of 1-10... 0, 1, 2 I don't believe). Even the best liars have sightings, and even the most honest person errs with a stump. A wise statistics prof once told me to always include everything... you never know what might be important later, or lead to a new understanding of your subject. There are other reports too: misc. screams, strange sticks, odors, etc. You never know what might someday be important or form a trend.
To maximize usefulness, have added approximate lat/long, elevation, and average annual temps/precip by month for the locality, along with sunrise/sunset and moon phase data (adjusted to Greenwich time). On the initial site post, have included BF height/color, track length, and BF activity. All can be easily used without going to the full report.
Many uses for data... for instance, peak BF activity seems to be at full moon (of course)... and during new moon. In other phases, activity drops off. Activity drops off BEFORE sunrise and peaks again in the afternoon and evening... then it's apparently naptime (see graphs... adjusted by using auto traffic counters to level out sightings).
Check it out at Internationalbigfootsociety.com, especially if you're into doing graphs and comparing different variables... is it possible to prove, for instance, that these things migrate or hibernate... maybe just sometimes? Do they vary their activity by climate, or ascend and descend by date or temperature? Lots of stuff to figure out. I'll keep adding stuff to make the database better.
Have recently added Google maps... but find that with 4000+ plots the land surface is covered by "balloons." And... it's not plotting right... stuff plots in the ocean, the wrong state, etc. Checked, and the lat/long is right. Have Rob Murdock working on it for me.
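
Two common causes of points landing in the ocean or in the wrong state are a latitude/longitude pair passed in the wrong order and a western longitude stored without its minus sign. Whether either applies to this plotting problem cannot be told from the post; the sketch below is just a quick check. The bounding box is a rough, assumed outline of the contiguous US (reports from Alaska, Canada, or elsewhere would need a different box), and the function names are hypothetical.

```python
# Rough, assumed box for the contiguous US: (lat_min, lat_max, lon_min, lon_max)
US_BOX = (24.0, 50.0, -125.0, -66.0)


def in_box(lat, lon, box=US_BOX):
    """True if the point falls inside the bounding box."""
    lat_min, lat_max, lon_min, lon_max = box
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max


def diagnose(lat, lon):
    """Guess why a stored (lat, lon) pair might plot in the wrong place."""
    if in_box(lat, lon):
        return "ok"
    if in_box(lat, -lon):
        return "longitude sign"       # west longitudes should be negative
    if in_box(lon, lat):
        return "swapped"              # values stored or passed as (lon, lat)
    return "unknown"


# Hypothetical stored pairs for a location near Portland, Oregon
for pair in [(45.5, -122.6), (45.5, 122.6), (-122.6, 45.5)]:
    print(pair, diagnose(*pair))
```

Running a check like this over every record before plotting would at least separate bad stored values from a problem in the mapping code itself.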
Ray
socaldave
Hi Ray, great to hear you have done so much with the IBS/WBS database. I need to get over and check it out. Is it free? Oh yeah, I need to renew my subscription to your newsletter; been a little short on cash lately!
Ken Y.
Well someone has to step up to the plate and really design a database that works and has the ability of calling on outside resources.

With the emergence of RDF (Resource Description Framework) technology, it could be done. Having each internet resource (database) converge, with one main database taking the lead, would further everyone's efforts.

Each database involved would need to use RDF, and in my opinion it would be close to impossible to get everyone to agree to use it. But it is still in the realm of possibility.
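
As a rough illustration of the idea, RDF describes everything as subject-predicate-object triples, and combining sources amounts to pooling their triples. The sketch below uses plain Python tuples rather than a real RDF library; the example.org URIs, property names, and report values are all hypothetical.

```python
# A triple is (subject, predicate, object); all URIs here are made-up examples.
EX = "http://example.org/"

site_a = {
    (EX + "report/a1", EX + "prop/kind", "sighting"),
    (EX + "report/a1", EX + "prop/state", "Indiana"),
}
site_b = {
    (EX + "report/b7", EX + "prop/kind", "track"),
    (EX + "report/a1", EX + "prop/state", "Indiana"),  # same fact, listed twice
}


def merge(*graphs):
    """Pool triples from several sources; identical triples collapse to one."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def subjects_with(graph, predicate, value):
    """Return the subjects that carry the given predicate/value pair."""
    return sorted(s for s, p, o in graph if p == predicate and o == value)


combined = merge(site_a, site_b)
print(subjects_with(combined, EX + "prop/kind", "sighting"))
```

The pooling only works if every site uses the same identifiers and property names, which is exactly the agreement problem described above.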

I would be willing to help with a database like that.

Until some organization really starts employing a queryable database with a useful graphical user interface, we are at a near standstill.

Useful websites in the history of Sasquatching are few and far between.

Ken

P.S.

I am currently working on a website that is going to help put research tools all on one site.
If you want your site to be included in the links section, please send me a personal message with your site's URL.

If I were to try to make a database like the one described above, which fields would everyone suggest including? It would be Linux-based, but it would have a customized operating system to speed up powerful research tools that would run server-side and be accessible with a standard internet browser.
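
As a starting point for that question, here is a minimal sketch of a queryable table built only from fields already mentioned in this thread (report type, a 0-10 rating, lat/long, elevation). The table name, column names, and sample rows are assumptions for illustration, and SQLite is used only because it ships with Python, not because anyone here proposed it.

```python
import sqlite3

# Hypothetical schema; columns mirror fields mentioned in the thread.
SCHEMA = """
CREATE TABLE report (
    id          INTEGER PRIMARY KEY,
    source      TEXT NOT NULL,               -- which group's database it came from
    kind        TEXT NOT NULL CHECK (kind IN ('sighting', 'track', 'other')),
    rating      INTEGER NOT NULL CHECK (rating BETWEEN 0 AND 10),
    observed_on TEXT,                        -- ISO date, if known
    lat         REAL,
    lon         REAL,
    elevation_m REAL,
    notes       TEXT
)
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def count_by_kind(conn, min_rating=0):
    """Count reports per kind, keeping only those at or above min_rating."""
    rows = conn.execute(
        "SELECT kind, COUNT(*) FROM report WHERE rating >= ? "
        "GROUP BY kind ORDER BY kind",
        (min_rating,),
    )
    return dict(rows.fetchall())


conn = open_db()
conn.executemany(
    "INSERT INTO report (source, kind, rating) VALUES (?, ?, ?)",
    [("demo", "sighting", 3), ("demo", "other", 3), ("demo", "sighting", 1)],
)
print(count_by_kind(conn))                # everything kept
print(count_by_kind(conn, min_rating=3))  # stricter view of the same data
```

Storing a rating with every entry, instead of deleting doubtful ones, lets each researcher apply their own cutoff at query time.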