Topic 7 is designed to help you understand the value of data and data management tools and how they can be applied. Very little of what we do today as an Information Society would be possible without the massive investment in database tools, techniques, and technologies … most of the knowledge we come across in our daily lives is stored in some sort of a database structure. Please read topic 7 so you can understand that there are different types of databases for different reasons. I have to say, you cannot solve our challenge without a complete understanding of database technologies — you just can’t do it. Pieces of what you learn in this topic will really allow you to solve it.
The topic begins with an overview of the importance of data. Without data, computers are of little value. The worth of a computer or an information system is measured in terms of its ability to support the processing, transmission, and/or accessing of data. In fact, data management can be the key to success or failure for an organization. Many companies run into difficulties because they lack the appropriate data to make effective decisions, and/or the way their databases are constructed inhibits the ability to analyze and manipulate data in a timely, cost-effective manner. The topic goes on to lay out the basic terminology associated with databases … this is critical information, as it gives you the foundation for everything that follows.
From there, it goes on to discuss the characteristics of databases and then to a very important lesson on database management systems. Make sure you finish out your reading as it gets into some fairly complex material … the key here is that databases are the main ingredient in so much of what you as a knowledge worker will need to know to solve large-scale IT challenges. Don’t miss reading this topic! When you finish your reading, respond to the discussion activity below … and remember, your challenge asks you to decide on a database system to power your solution; the information in this topic should be a starting point for you to do just that.
Discussion Activity
At times, we collect data from our employees, peers, customers, and friends. As knowledge workers, we are expected to sort through data and come up with meaningful information. There are many things that can jeopardize the meaning that we intend to pull from data, especially when data comes from surveys or some type of human-generated information (interviews, reports, journals, or books). The respondents’ truthfulness, the quantity of responses, possible bias, and time constraints can lead us to “stories” that might not be valid. Can you think of some other problems that may occur when trying to extract meaningful data? How do you plan to only extract valid and reliable data so as to create meaningful information?
Not having all the data, data of poor quality, incomplete data, and non-standardized data would cause problems when trying to extract meaningful information. In order to be able to sort through data and make something meaningful out of it, I would have to assume that all the information was accurate and valid unless I had some information to tell me otherwise. By this I mean that if I were given data to sort through, I would assume the information is unadulterated. If the data was somehow doctored, then a complete and accurate composition would not be possible without some degree of uncertainty.
The DBMS, or database management system, allows users to create, edit, store, update, delete, and, most importantly, organize and make use of data that would otherwise be unused or inaccessible to company employees or anyone else for that matter. I feel this type of organization system is extremely important and useful and, if used efficiently, can become a necessary asset to a corporation. As shown in the reading, it can be a very useful tool for inventory, mailing lists, etc., but when looking at the data that is being gathered and submitted in these systems, there can be problems.
When looking at the question of what problems might occur when extracting meaningful data, I think keeping data valid can be a difficult task if one allows their own thoughts and viewpoints to conflict with the data collected. The way I would try to minimize the risk of my own bias influencing whatever I was trying to obtain information for would be to ask others if they felt the questions were fair and non-judgmental. The subjects’ answers to a question will probably carry some bias of their own. Everyone views things differently, so trying to minimize participant bias would, I think, be extremely difficult. As for incomplete data, I would eliminate it from my data pool altogether.
When extracting and analyzing data you must know what you need the data for, which will help you pick out what you need from a particular set of data. The data that you determine is useful, for whatever reason, must also be complete. If your data is missing important features, then you won’t be able to interpret it as well as you probably should. The size and source of your data could also cause problems by increasing your chances of using the wrong material, causing a waste of time and resources that could be limited.
When we are trying to extract meaningful data, there might be subjective bias from the interpreters that influences the findings. For example, when experts are hired by a company to extract information for it, there might be a bias, since they are paid to do so.
Another problem that might occur is auspices bias, where the respondents’ answers are influenced by the organization conducting the study. For instance, if a gun association does research on gun usage, its results would be very different from the same research done by a welfare department. People tend to be influenced by the organization that conducts the survey.
In order to extract valid and reliable data, I would plan to conduct the survey in a way that avoids both interpreter bias and auspices bias. I would train the interpreters well and provide guidelines and rules for interpreting the results. Another way to avoid interpreter bias is to let an outside marketing research company do the job. I would also send out an anonymous group to conduct the survey, so that the respondents will not be influenced by the organization during the interview.
I know from working with and seeing databases that retrieving online data can be a tricky task. If the user at any point fills in the wrong field with the wrong information or raw data, that could screw up the entire survey, order, or whatever they might be filling out. This incorrect information, once stored, could also render other data entries useless. My point is, when retrieving data from users, the programmer must make it as easy and as clear as possible what is being asked of the user. Even after the explanation and instructions are given, the programmer should apply safeguards just in case the user does mess up. For instance, if the user is ordering something and inputs an incorrect credit card number, the programmer should create a safeguard that informs the user of the mistake. This might be something as simple as making sure the credit card number and the name given match up.
The scary thing is that with all the data traveling around on a daily basis, a simple mistyped character or information in the wrong field can really give false information.
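To make that concrete, here is a minimal sketch (in Python, not from the reading) of the kind of safeguard described above. Real card numbers satisfy the Luhn checksum, so a check like this catches most single mistyped digits before the order is ever stored; the sample number below is a test value, not a real card.

```python
def luhn_valid(card_number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    digits = [int(c) for c in card_number if c.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Double every second digit from the right; subtract 9 if the result exceeds 9.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

# Reject the form input before it ever reaches the database.
if not luhn_valid("4111 1111 1111 1112"):  # one digit off from a valid test number
    print("Please re-check your card number.")
```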
First off, in order to extract meaningful or valid data, the retriever or person who desires the data needs to make sure the source or person they are getting it from is indeed reliable and valid. If the user is a chronic liar or a person of strong bias, they may produce useless information. Also, I think in order to retrieve valid and useful data, the programmer or retriever should make the questions, and what is expected of the user, extremely easy to understand. They should use an easy-to-follow interface so as not to confuse the user. They need to emphasize the importance of honesty and of being either impartial or partial, depending on the information desired. Perhaps most importantly, they need to make sure the data can travel efficiently and securely to the database where it is stored. If data is ruined, stolen, or changed during the process of storage, then it is useless information.
Overall, while many factors such as user integrity and ease of data extraction play important roles in retrieving valid data, I tend to feel the most important role in the entire process would have to be data security and information assurance.
When extracting meaningful data, there are many problems that could possibly come about. There is the issue of lost data: if a database goes down, sure, there is a backup, but it’s not as accurate as the one that went down. There are several methods used to fill in the gaps, but the result will not be as accurate as the original database was. Another possible problem is security: who has access to the information, and to what information do those people have access? Once a person extracts data, they need to organize it well; otherwise the extracted data might be just as useless as raw data would be. Keys help you extract the data that you want, too, because you can search by a certain key element. Another thing to keep in mind is who has done the data extraction. Don’t forget that statistics lie. If the person extracting the data has a specific point to prove, they can pick and choose what data to use in their final report, misleading others by manipulating the data to their own advantage.
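As a small illustration of the point about keys (the table and column names here are invented, not from the reading), declaring a primary key both prevents duplicate records and lets you pull back exactly the row you asked for:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE customers (
           customer_id INTEGER PRIMARY KEY,  -- the key element we search by
           name        TEXT NOT NULL,
           city        TEXT
       )"""
)
conn.execute("INSERT INTO customers VALUES (101, 'Ada Lovelace', 'London')")
conn.execute("INSERT INTO customers VALUES (102, 'Alan Turing', 'Manchester')")

# Searching by the key returns exactly one, unambiguous record.
row = conn.execute(
    "SELECT name, city FROM customers WHERE customer_id = ?", (101,)
).fetchone()
print(row)  # ('Ada Lovelace', 'London')
```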
When extracting data many problems can occur. The biggest problem is false information. People who lie in the research can throw off the data. You must decipher between what you think is true and what is false. Lost data can cause a problem as well. Even though the data can be backed up, sometimes some of the information is lost and therefore the data can be misconstrued. Some people might interpret the material incorrectly, making the results go a different way. When recruiting people to take the questionnaire or whatever instrument is used, there could be a problem with different people’s interpretations; they could think that you are asking them something different. People who are doing the research must know how to pick out the good and the bad and make sure that the data comes out as it is supposed to, without, I don’t want to say any problems, but little errors in their data.
As we continue to push our limits with technology, we must be careful with what we do with the data and make it as productive as possible. There will always be mistakes or problems that arise, but we must know how to correct them. Fraud is a huge concern that can grow out of data input by companies and people. Computer programmers must know how to prevent false credit card claims and make sure that the right person is entering the data. Privacy is key in preventing fraud, and to do so companies must secure their network databases in order to keep public confidentiality. Every person has the right to keep their private information confidential, but sometimes that is not the case. Companies may sell their databases of information to other companies to use in a study or to contact people without the user’s consent. Updating this information can become another problem that may make databases inaccurate or useless. Information is constantly changing, and it is very hard to keep databases instantly updated because people are always changing addresses, there may be a death, and babies are constantly being born. There are several problems that can be accounted for, but the researcher must know a way to prevent these problems or else be stuck with unreliable and useless data.
Extracting valid and reliable data is harder than it may sound, but planning out the data you want is a great first step. You must make sure that people are able to understand what you are asking for and that they can give you the truth. If people are confused, they will tend to lose interest and just start filling in anything. You need to find data that will keep your results unbiased, and to do so you have to ask several kinds of people rather than sticking to one group. Hiring experts to look at your data can be a good way to find skewed results or to find out that something is just not right.
Finding the right database management system (DBMS) is most important in creating and updating data. Allowing users to create, edit, and update data in database files is useful in making sure you have correct and current information. The ability to store and retrieve data from those database files allows the researcher to access the information and use the meaningful data. This can mean the difference between success and failure for an organization. Databases will never become extinct, because without them information would be very difficult for organizations to analyze.
Producing meaningful information takes much effort and organization. When a worker is attempting to collect data, many factors must be taken into consideration. The worker must figure out who will be the targeted audience, what will be the most effective way of gaining information from them, and how will the worker make sure they are truthful.
When obtaining human-generated information, it makes sense to use people who are familiar with the topic being discussed. If the respondents have an interest in the topic, they will be more willing to take their time and answer appropriately. This way, the audience’s opinion will carry more weight and will most likely be more truthful. With a little bit of effort, the worker will find the right audience who will give meaningful data.
Obtaining valid information from respondents can be difficult. The worker can obtain data through surveys but that is very impersonal and possibly invalid. Interviews are personal but they are also very time-consuming. There really isn’t a perfect way to gain human generated information. It depends on the situation, the number of respondents, etc. The most effective way to go about collecting data is probably a mixture of surveys and interviews.
Lastly, the worker needs the respondents to be truthful. This is where most of the worker’s effort and organization comes into play. If the worker has researched his target audience and has organized an effective way to keep the respondents interested, then he/she has done their best to obtain truthful and significant information.
Only meaningful information is useful when it comes time to interpret it. If the data isn’t clear and concise, it is basically useless.
I think solving the problem of collecting meaningless data begins with those conducting the study. If they are passionate about the study, then the results and how they obtain the data will be taken seriously. Also, the user must be familiar with the technology they are using, because without the data, computers lose a lot of their potential. Other problems can occur with collecting data. First, the data can only be interpreted if it is entered into the database correctly. I learned in my statistics class how just one erroneous entry can throw off a whole correlation or ruin the mean. Also, you have to trust that the sampling of people from which the data is being collected is truthful. If they didn’t take it seriously, then that data seems kind of worthless.
Data is how we survive in the business world. We collect, analyze, and make decisions based on company status, progression, opinions, and so much more. Being able to comprehend the information and use it to your advantage can keep you a step ahead of those who don’t pay attention to it.
Extracting meaningful information out of a database has two real main problems so far as I can see.
For one, there is a lot of room for user error when entering so much information into a database. Many times a person’s only job is to sit at a desk all day and punch numbers and names into a database. I know personally that I am a pretty good typist and I still make mistakes all the time. A simple mistake in a database can completely throw off certain queries somebody might try to pull out of a table.
The other glaring problem I see is that many times the wrong information is given or pulled when trying to make meaningful facts out of it. It’s important that the information entered into a database contains everything that would ever need to be searched for at a later time. If all you enter about a person is their name and phone number, and you’re trying to find people of a certain interest or area to call, you can’t possibly know which ones to use. The same goes for receiving bad information to put into the database. This can happen for a bevy of reasons, be it untruthful data, bad polling, or even mix-ups while gathering data.
It seems to me like the software behind these databases works well, and it’s more of an implementation problem than anything else. If there were more sanity checks in these systems and more failsafes in the data gathering, these databases could prove even more beneficial than they already have.
One of the biggest concerns for acquiring meaningful data is people who are not capable of providing that data because of their lack of knowledge on the subject. Another huge concern is those who just don’t care about the topic. Collecting meaningful data from these types would be almost impossible, not to mention pointless. If you don’t have a plan of action, that will also make retrieving the specified data difficult.
Certain technologies are also necessary for collecting at least reasonable data, such as computers and communication hardware/software; also consider group interaction and cooperation. In order to retrieve "valid" and "reliable" data, you have to incorporate some (if not all) of the technologies listed above. Conducting surveys, researching online and in libraries, and interviewing knowledgeable individuals are all excellent starting points for collecting information.
There is a ton of supposedly meaningful information on the internet, in companies’ databases, and in your own. When you, a company, or whoever takes this information, the biggest problem is: how do you know if it is meaningful data or meaningless? And because reports, websites, and books are written by human beings, they are often skewed, and the information they give out may not be valid. Another problem with receiving information is that you, as a customer, can be subjectively put into a category you don’t want to be in or for some reason shouldn’t be in.
To help ensure "appropriate" customer service, some companies have initiated a grading system for their customers, so the customers with the "top grade" get the most attention from marketers and the best service from companies. So whoever has the most money gets treated the best, and people who are struggling get to stay that way. That also means the people with money get annoyed with calls from companies, and people without the moolah get nothing. This "meaningful" information is produced from the money or the spending habits of a person. Other supposedly meaningful information can come from people’s ideas or opinions that are taken as facts. The best example of this is children or students using the internet.
The only way to make sure that what you get from the internet is "meaningful" data is basically to make sure the information is from a reliable source, reliable in the sense that it comes from a source known for accuracy in that subject. For example, information on the French and Indian War should come from a history teacher or a well-known author on that subject. As for companies harassing their best customers, that is an accurate and good use of meaningful data, because those people with the "top grades" do have the most money, which means they do not have to be frugal with it.
Well, obviously as a knowledge worker, I would make sure my employer was set up with a DBMS (Database Management System). This would allow for improved availability, minimized redundancy, accuracy, program and file consistency, user-friendly data, and improved security. This would also assist me in the task of extracting meaningful data.
Truthfulness, quantity of responses, possible bias, and time constraints can lead to “stories,” and these “stories” may not be valid information. Other things that could raise the chance of extracting bad information are not caring about what you are trying to find out or what the meaning is; the researcher must also be educated in the data he is trying to collect, otherwise people will not take it seriously. When looking for data, we should be doing research that is randomized so it represents the population in question. Things that can help us do this are a bigger sample size and making sure we aren’t only getting a minority view. Also make sure that survey and interview questions do not invite easy opinions; you want a firm yes or no, or some sort of quantitative data that cannot be easily manipulated, so as to raise the chance of getting the truth. Once we have this data, we should make sure we load it into databases in the most effective way. We should make sure it is efficient and effective by incorporating smaller sets of data that are easy to search through. We should also make sure that accessibility of data is constrained within different groups. This will also keep later studies from being biased by seeing other people’s data.
All of these things will assist in the extraction of meaningful information, but it is almost impossible to hope that all data will be perfect. There are always going to be people who aren’t truthful and are biased. The researcher needs to do his or her best to get good data that can be easily turned into meaningful information. Also, one of the biggest problems with databases is that people make mistakes; once again, no one is a perfect entity, and any little mistake can throw off a whole dataset.
Getting meaningful information from data can be a hard task to accomplish. Not only can you have bad data from individuals who contribute false or poor data, but the way the data is interpreted can also play a major role in getting meaning from it.
When a survey is sent out, or when people contribute data in other ways to a business or other organization, there is no way to tell if the data coming in is honest information or if it is completely different from the intended information. You can’t trust all people to be completely honest with the information they’re sending you. One reason for this false information may be that some people just don’t care. If they get a survey that they just don’t want to fill out, they may start answering questions without caring at all.
I believe the other key concern about extracting meaning from data is the group of people interpreting the data. What happens if a group misinterprets data and causes a business to lose money because of a bad decision? The information gathered wasn’t very meaningful to the company at all. Training people to use and interpret the data may be a way to keep this kind of lost meaning from happening.
If a business were to hire people to sort through data before it is put into a database, the amount of bad data coming in might be decreased. If such a filtering step were in place, a business would benefit from only the good data entering its database. If there were no way to filter data before it enters the database, who knows how much bad data could make its way into the system, and no business or organization wants to make decisions based on bad data.
Overall, there is no way to completely stop bad data from coming in. There will always be those people who purposely supply false information. Training employees to interpret information effectively, and also training people to look for and get rid of bad data, may be key ways to stop bad data from entering a database and to more effectively interpret the right data so as to get meaningful information from it.
One problem that can occur with data collection involves sampling variation. Whenever we work with samples selected from populations, a certain amount of random variation is always introduced. This is unavoidable and must be accepted because the variation is random, or due to chance. To minimize this problem, larger samples should be studied: the larger the sample size, the smaller the sampling variation. Another approach is to total the amount of variation and then decide whether the results appear reasonable and useful or not.
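A quick, hypothetical simulation of that point (the population and the sample sizes below are arbitrary choices, not data from any study): the spread of sample means shrinks as the samples get larger.

```python
import random
import statistics

random.seed(42)
# A made-up population with mean 50 and standard deviation 10.
population = [random.gauss(50, 10) for _ in range(100_000)]

for n in (10, 100, 1000):
    # Draw 500 samples of size n and see how much their means vary.
    means = [statistics.mean(random.sample(population, n)) for _ in range(500)]
    print(f"n={n:4d}  spread of sample means ~ {statistics.stdev(means):.2f}")

# The spread drops roughly with the square root of n:
# around 3.2 at n=10, 1.0 at n=100, and 0.3 at n=1000.
```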
When trying to extract data there may be some variation in the instruments themselves. It is important to make sure that the same type of instruments was used throughout the study. Proper calibration of machines is absolutely vital before data collection or measurements are begun if machinery of any kind is involved.
In order to create meaningful information, it is important to make sure that selection bias has not occurred. This happens when there is a systematic difference between the characteristics of the people selected for the study and the characteristics of those who are not. For example, if a clinical trial is being done to evaluate a medical treatment product and older people are excluded from getting the new drug, selection bias has occurred. Older people may have a poorer prognosis and the effectiveness of the drug may not be truly represented.
Accurate data collection also depends on the least amount of human error. Often times variations in judgement occur. To avoid this as much as possible, it seems prudent to assign more than one person to a task. This will allow for a type of group consensus and help avoid errors.
In order to create meaningful information, I plan to examine each step involved in the data collection as much as possible. Reliable data interpretation involves having a good collection plan. This should include a description of the project, a clear explanation of the specific data that is needed and a rationale for collecting the data. Each step in the collection process should be explained and easily examined. Procedures for analyzing the data should be spelled out and open to scrutiny by others. Information about the integrity and reliability of the data collectors should be obtained if possible. It is also important that I work carefully and accurately. I plan on incorporating safeguards and checkpoints in all of my work. If all of these steps are followed, my results should be as accurate and reliable as possible.
Having the correct information is very important to a company. When a large corporation has thousands of pieces of data in a large DBMS, a lot of the data depends on other data; therefore, one small mistake could cause a large portion of the data to be wrong. While trying to extract meaningful data, individuals run into many problems. Bias, time constraints, and quality of responses are all problems that workers can run into, as is the case when the person collecting data puts his or her own viewpoints into the data being collected or influences how the person giving the data will respond. When incentives are involved, the respondent is more likely to give truthful information. If the customer has a large stake in the correctness of the information, for example if they are applying for a home loan or other large loan, they are more likely to be careful when giving their data. If it is for something smaller, such as a credit card purchase or something of less importance, they may give false information without even trying. Such problems can be reduced by using incentives. Standardized forms, read by a computer, are also more effective in reducing human error. This process requires less human interaction and therefore alleviates the problem that arises when a worker reads hundreds or thousands of pieces of data per day. Workers get tired, careless, and pissed off and don’t put as much effort into reading the data. Machines, on the other hand, can read the data 24 hours a day and still not make mistakes. The only error that can arise here is when the person entering the data onto the sheet makes a mistake. Data is boring, and nobody likes to handle it, but it is possible to extract meaningful data if the right operations are used.
Well, since we are speaking about our class project, one problem is that the file the DB finds may be corrupted. For instance, anyone who has ever used Kazaa knows that probably six out of the ten songs you download turn out to be crap.
Another problem with a DB can be how efficiently the files contained in it are named. For instance, if I want to search for a Britney Spears song, the DB will bring up files with “britney” or “spears” in the file name. But if a person names all of their saved information “britney spears”, the DB will bring up files not pertinent to the user’s search.
To combat this problem, really the only way is to go through the pieces of information carefully. But the best way is to place integrity constraints on the files contained in the DB. This is where the “WHERE” command in SQL comes in. It allows us to sift through the wealth of information in the DB (e.g., “I want all Britney Spears songs that are less than 3 minutes”).
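Here is a small sqlite3 sketch of that idea, with made-up song data (the titles and durations are invented for the example). The CHECK constraint rejects obviously bad rows at insert time, and the WHERE clause then sifts the rows that remain:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE songs (
           artist   TEXT NOT NULL,
           title    TEXT NOT NULL,
           seconds  INTEGER CHECK (seconds > 0)  -- integrity constraint on length
       )"""
)
conn.executemany(
    "INSERT INTO songs VALUES (?, ?, ?)",
    [("Britney Spears", "Toxic", 170),   # durations invented for the example
     ("Britney Spears", "Lucky", 205),
     ("Outkast", "Hey Ya!", 235)],
)

# "All Britney Spears songs that are less than 3 minutes"
rows = conn.execute(
    "SELECT title FROM songs WHERE artist = ? AND seconds < ?",
    ("Britney Spears", 180),
).fetchall()
print(rows)  # [('Toxic',)]
```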
Other than that, not a whole lot can be done, at least to my knowledge.
There are other problems besides truthfulness issues in database analysis. Those problems can come from having non-standardized data. But the thing about non-standardized data is that when it occurs, it’s really the fault of the programmer. The database should be programmed so that when someone answers a query in an incorrect or non-standardized manner, a default error message appears. There are still problems extracting meaningful data even when everything is correct. Those can come from the sample size and population of your database. If your database is small and you extract information from it, your data may be skewed because your pool may not represent the general population.
When thinking about the question “Can you think of some other problems that may occur when trying to extract meaningful data?” there’s a lot that comes to mind. It can be anything from how you varied your samples, to whoever you were sampling providing false information, not completing the survey, or there being a time problem. Another problem is how you target your audience and spread out the surveys. There has to be some sort of random sampling done so the information isn’t biased. Other problems can be biased questions, or the fact that sometimes when I’m doing a survey I don’t fully understand what’s being asked and just answer the question anyway. So I think that a lot of planning and breaking down of what exactly you want out of the surveys needs to be done to prevent problems. I don’t think there is any way to get perfect data; there’s always a margin of error. But planning out your work carefully and generating a well-populated list will really help cut down the margin of error, though there’s no way to truly eliminate people lying and so forth.
Sometimes people ask the wrong questions when trying to gain specific data on surveys or interviews, and the answers get interpreted differently than what was really meant. Sometimes people also take parts of data while leaving out key elements. For example, George W. Bush gave a statement earlier this year talking about terrorist threats: “Our enemies are innovative and resourceful, and so are we,” he said. “They never stop thinking about new ways to harm our country and our people, and neither do we.” Now, really, I’m not even sure what he actually meant in that statement, but if someone only heard that part of his speech, then of course there would be no way he would get elected again, right? Well, his next statement was about never stopping to think about defending this country, trying to clarify his earlier statement. Data is a sensitive thing, and it can manipulate the way people make decisions by feeding them data that is true but also not complete.
The more data there is and the more it is organized the better the outcome of a decision will be. However some data should not be accessible to everyone, although it can be arguable such as the case in the physician example in the reading.
One problem with data recording is that data can change, and it can be difficult to apply those changes, such as a change of address, phone number, or name. To extract meaningful data you must make sure you are asking the right questions for the data you are trying to obtain, make sure you include all the facts, and organize everything properly.
People are always going to give biased responses to questions. Depending on the mood they are in, whether they have had a good day or a bad day, married or unmarried, man or woman, all of these play a role in the response of the person.
With all of the new emerging technology, I feel that collecting accurate data has become both easier and more difficult. Easier in that more people can be reached in less time and the researcher can make it very easy for the person to respond. Harder in that the study will be less personalized, and it is much easier for the respondent to just click through buttons really fast to get done with it without ever really reading or understanding.
There is no real way to solve this problem. If you get a researcher who is really excited about the research, they might be biased toward certain results, and if you only get responses from people whose answers you already know, then there is no point in the study.
It is tough to say how I will find a way to get meaningful data, but I think I would always use a large study group. That way, the more people responding, the better idea you have of which way to go. The only catch is that the larger the group, the more room there is for collection and entry errors to creep into your study.
Extracting meaningful data can become a difficult task, especially when you are unsure of the accuracy of the data. Data accuracy can be influenced by factors relating to how the data was collected. In a survey format, people are usually uninfluenced by others and have a higher probability of telling the truth, whereas in a focus group or similar setting, people are usually influenced by the presence of their peers. This can lead to inaccuracies, but the interviewer should be able to distinguish accurate information from people just going along with the group.
Some other obstacles to obtaining accurate information are:
1. People do not understand the questions being asked like in a survey or do not understand what kind of response to give.
If people do not understand what is being asked of them, they most likely will give inaccurate information that will not pertain to the question asked. If this information is then used when compiling the data, the results will be inaccurate.
2. People might not want to participate in offering up information about themselves and their habits.
If the surveying group is unable to obtain enough people to participate in the surveying process or other types of informational gathering, the data will be inaccurate. The results will be skewed in one direction instead of being a total representation of a population. This will yield inaccurate information to the database, if not taken into account when entering the data or performing statistical analysis on the data.
To extract meaningful data, researchers need to be careful about who they are receiving information from. If they do not take into account the above factors or the factors listed in the question, the data will be inaccurate. If people do not understand the questions, then time needs to be taken to rewrite or rephrase them so the majority of the population can understand them. If there are not enough people willing to participate, proper statistical adjustments need to be taken into account.
I think if information gathering is done properly, data can be accurate. There are just a multitude of factors that one needs to be aware of and take into account.
When gathering data from interviews, reports, journals, or books, it’s difficult to know whether the data is accurate or a made-up story. As knowledge workers, this makes our job of collecting data time-consuming and complicated. We have to be able to sort through data and determine what is false and what is meaningful.
Data is said to be only as valuable as our ability to access and extract meaning from it, and we cannot extract meaning from it without organizing, storing, and analyzing it effectively. When looking at data we have to consider many things, such as the respondents’ truthfulness, the quantity of responses, possible bias, and time constraints. Considering these possibilities enables us as knowledge workers to pick through false data and determine what is accurate.
When looking at data, the knowledge worker has to be informed about what they’re looking for. By having a good, solid understanding of the data they’re looking at, they’ll have a better sense of what is accurate. Gathering as much information as possible, and from many different resources, can also help you determine if there is a false pattern in the data. When looking at data such as surveys, you need to know exactly how the survey was administered to determine if there were any biases or falsehoods. Each piece of data needs to be organized and researched thoroughly, from how it was administered, to who wrote it (reports, journals, etc.), to the time at which it was gathered. By evaluating each piece of the research, a knowledge worker can determine how accurate the data is.
Gathering data can be a huge task, mainly because there is so much false information in the world, from simple human errors to mechanical computer errors. As knowledge workers, it is our job when doing research to make sure that the information we’re using is accurate. In order to do that, you have to be willing to complete the steps to extract the meaning from the data you have.
As economists, we deal with statistics every day. Most of our work relies on a large body of information that we use to compile results. But with any such information you will always have a mean, and deviation from the mean. What we want to do is get enough participants so that we can narrow this deviation from the mean. An example is a coin flip. There is a fifty percent chance of flipping a coin and getting either heads or tails. However, if we flip this coin five times, we may encounter all five flips being heads or tails. A better measure is to flip the coin a hundred times: probability never fails, and we will encounter a percentage close to fifty percent, though maybe not exact.
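A hypothetical simulation of the coin-flip point (just an illustration, not course material): with 5 flips the observed share of heads bounces around wildly, but with more flips it settles close to fifty percent.

```python
import random

random.seed(1)

for flips in (5, 100, 10_000):
    heads = sum(random.random() < 0.5 for _ in range(flips))
    print(f"{flips:6d} flips -> {heads / flips:.1%} heads")

# Typical result: 5 flips might show 20% or 80% heads,
# while 10,000 flips lands very close to 50%.
```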
Another problem that could occur is externalities. Surveys can differ by demographics, such as the region you live in, sex, age, or heritage. These variables can all alter the way a survey comes out. Let’s take State College. The question “Do you think illegal file sharing is OK?” might get a sample of eighty participants with eighty percent saying yes. However, if we were to conduct the survey in Washington, DC, we might get the reverse result. Externalities such as time constraints and truthfulness will always be present, but the larger the sample, the less of a problem these abnormalities become.
Another tool is a statistical method called regression. Say we want to find out why NBA players are paid so highly. We could look at data such as their points, but is that valid on its own? Points have a positive correlation with salary, but other factors such as rebounds, steals, and other statistics have an influence too. Factors such as draft pick, family, race, size, jail time, and intelligence can all influence pay. With regression, we combine all this information and can see how much influence each factor has on salary. We might find, for instance, that scoring has a sixty percent influence on pay: a huge impact, but one that doesn’t explain everything. On the other end, jail time might turn out to have a 0.008% influence on salary, a weight so small that we can exclude it from the factors that determine pay. Regression is a useful method for separating valid data from the stuff that doesn’t really matter.
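As a rough sketch of the regression idea, with entirely made-up numbers (nothing below comes from real player data), ordinary least squares estimates how much weight each factor carries in salary; the factor with a near-zero weight is the one we could drop.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Invented player statistics.
points   = rng.normal(15, 5, n)
rebounds = rng.normal(6, 2, n)
jail     = rng.integers(0, 30, n)          # days, invented

# Invented "true" relationship: salary driven mostly by points,
# a little by rebounds, and essentially not at all by jail time.
salary = 0.5 * points + 0.1 * rebounds + 0.001 * jail + rng.normal(0, 0.5, n)

# Ordinary least squares: stack the factors (plus an intercept column) and solve.
X = np.column_stack([points, rebounds, jail, np.ones(n)])
coeffs, *_ = np.linalg.lstsq(X, salary, rcond=None)
for name, c in zip(["points", "rebounds", "jail", "intercept"], coeffs):
    print(f"{name:9s} weight ~ {c:+.3f}")
# The near-zero weight on jail time is the signal that it can be left out.
```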
The world today is so technologically advanced. We have all these tools and programs, like databases, that allow us to gather and process information easily and quickly. However, no matter how advanced our technology has or will become, there will always be some problems with these technologies.
The DBMS, or database management system, is a great tool. It allows users to do many things and keep track of information including bookkeeping information, customer and inventory information, mailing lists, and personal records. They can also keep track of tax information and surveys. So you might think that all of this is good and there would be no problems. However, there are problems with database systems.
So what exactly are these problems? Well, the databases themselves cannot account for information that has been altered or is incomplete. Most databases will process the information that is entered by the user; the database itself has no way to account for biases that people may have had during data collection. Databases just assume that all of the information is correct and that the user intended to enter it in the way he or she did.
When we get information from reports, journals, and books it is hard to be certain what information is one hundred percent accurate and what information is made up. This is especially seen when using the WWW to do research. I have done Google searches so many times and came up with junk that didn’t have a thing to do with what I was researching. One of the biggest biases comes from surveys, as people tend to stay within their own scope of friends and acquaintances, resulting in responses from people who feel the same way as the surveyor. This will almost always sway results one way or another.
As knowledge workers, it is our job to sort through all of the information we gather and receive. We are ultimately responsible for feeding the proper or, unfortunately, improper information into each database. It is general knowledge that we will have a better understanding if we are knowledgeable about the subject matter and know exactly what we are looking for. This is yet another problem, as not everyone knows what they are looking for. If we all knew what we were looking for, we could find and get rid of the false information and obtain more meaningful information.
In order to extract the meaningful data, we need to get rid of the false and biased information. However, I truly believe that we will always have problems with databases, because inaccurate or made-up information and biases will always exist.
In every organization, data is a key component to effective decision making. Thus, it’s pivotal that this raw data is entirely accurate so that appropriate meaning can be drawn from it. However, considering the vast amounts of data that organizations today must collect, assuring its validity can be a complex matter — especially when this data is human-generated.
When dealing with human-generated data, numerous factors must be taken into consideration when attempting to assure its accuracy. Particularly in surveys, the respondent’s truthfulness, bias, and desire to participate must be taken into account. The respondent could just be responding with what he or she thinks the interviewer wants to hear, or they could deliberately be trying to throw off the survey. If extrapolation based on the data is planned, one must also consider just how representative this data is of the population being analyzed and assess any margin of error. Some examples of this type of data seen every day in the news are political surveys; professional pollsters must estimate the percent error associated with their analyses. Human-generated data is by no means the only type of data that can present difficulties in this area, though.
Another type of data that can be confusing to determine the accuracy of, and one which my team will be depending on greatly in our project, is written material — specifically material found via the internet. When dealing with the internet in particular, a large amount of invalid information can be found, and it’s often presented in a manner which seems convincing and legitimate. It will be very important for all of my team members and me to only gather data from reputable sources, and to always document our sources for later reflection and citations. After assuring the validity of our data though, it will be equally important to organize it efficiently.
If data isn’t organized effectively, problems even more significant than the quality of the data can arise. It doesn’t matter how vast an amount of data an organization has if it isn’t entered and organized properly because it can’t be accessed and used in a timely fashion. This is where databases and Database Management Systems come into play. These tools help organizations and individuals to manage large amounts of data, compare certain sets of data to other ones, and to make this data available to specific users. Choosing the best type of DBMS for one’s specific goals is another important decision that needs to be made.
There are many potential problems that can arise when analyzing and organizing data, but when information workers are careful and make the right decisions, the results can enable organizations and individuals to make important decisions in an informed and timely manner. Many of the topics discussed on this thread will have to be considered carefully by my team when working on our project. Topics such as how to use databases most effectively, and doing further research to determine the validity of authors and data we find in books or online, will come up numerous times throughout our work and throughout our futures in the information industry.
When extracting meaningful data, some problems can arise. Although, with careful organization and work, I believe it is possible to take only valid and reliable data and be sure your results are correct and meaningful.
When trying to gather meaningful data, you must be careful that the information you are gathering is true and complete. If you’re conducting a survey, you must be sure your questions are easy to understand so responses are consistent, and that the person or group you are questioning is a reliable and valid source. If these guidelines aren’t followed, you may run into problems with the truthfulness of your results. Also, you must be sure the data you’re collecting is complete and of good quality. When entering the data you have received into a database, you must be sure everything is correct. If it is not, for example if a number is entered incorrectly or left out completely, your results will not be valid and the error may throw off the final results you are trying to obtain. Also, I believe the amount of data you gather can cause a problem, too. If you gather a large amount of data, there is more room for error when entering it into a database, but if you don’t gather enough, then your results may not represent an entire group or population. You must find a happy medium where you can obtain valid results with little room for error.
To be sure you’re collecting only the meaningful data, you must be very careful when collecting your information. Be sure your subjects are reliable and honest. You must also be very careful entering this data into a database so there are no errors and your results are correct. Also, you must collect enough information to produce meaningful results, but not too much so that there is a lot of room for error.
Extracting meaningful data and avoiding errors can be very difficult. But I believe that with careful planning and consideration, producing meaningful results can be accomplished. Following some simple guidelines and carefully monitoring the data and how it is collected can be very helpful and a huge step toward producing the results and data you want.
In addressing the challenge of data manipulation, it is important not to use raw data blindly, but to understand the potential problems with databases and with how data is collected and maintained.
Some of the problems listed in topic 7 included human errors, software and hardware failures, theft, access security, loss of personal data, confidentiality, sabotage, loss of backups, loss of money, and so on. The issues of security and protection will always be a hot area for those functioning in the digital age. I think it is important, when using a database, to understand how the data is protected and whether it is secure. This understanding will lead to insight into the potential problems that might be inherent in the data.
What is the source? How was the data collected? As with any research, it is vital to know whether the output of your extraction is relevant, and to what extent the information is useful. How the data is maintained is another related issue. Is there a disaster recovery plan? How is it backed up, and how often? Is it possible to lose information between backups?
For data to be considered in a useful context, the user must understand the limitations of each database in how information is collected, maintained, and extracted, and what problems could impact each area.
There are many problems when trying to collect data from people using surveys or other human-generated sources. First of all, there is always human error. The survey may have questions that are worded in a way that causes confusion; questions need to be simple and precise so that the objective is clear. Aside from human error on the part of the survey makers, there can also be human error on the part of the person entering the raw data into the DBMS. If you hire a data entry clerk to enter data (especially numerical data) and they enter something off by one digit, it can throw off the results.
Another problem with trying to extract meaningful data is that the surveyor or interviewer may be asking the wrong questions to get the information they wish to receive. Almost all of the planning needs to go into how to extract the correct raw data and how best to ask the questions to get that data. Also, in order to extract meaningful data you need to make sure that you are asking the right people and a large enough population so that outliers will not greatly affect the results of the study.
When my group and I collect data for our project, we will make sure that we spend a lot of time on the planning portion of our survey and DBMS. We will make sure that the questions are simple and clear, and we will ask people who relate to our questions (we wouldn’t want to ask 85-year-olds about the topic, since many don’t download music). We will also be very careful when we enter our data so that our results will be meaningful information.
“There are three kinds of lies, lies, damnable lies, and statistics.”
Much of what has been discussed so far has related to surveys and polls, but I will take a closer look at the professional world of data collection and interpretation. Luckily for the business world, it is rare to see lies seep into a database. Most of the information gathered is based on actual numbers, such as the amount of sales made by a certain employee, or the number of jobs done for a specific client. This lack of allowance for “stories” leaves space only for user error while inputting the data, or does it?
As the example in the reading introduction states, certain other factors can create problems for data gathering. The given example is of an insurance company that cannot store all its client information in one database because different branches use different customer codes. Another such problem was the Y2K bug. If a company stores database information using only a two-digit number for the year, the stored numbers repeat every hundred years, and the data no longer says which century a record belongs to. This creates a problem when trying to build reasonable information out of the data: if a company wants to study how many sales it made in 3001 and the stored information goes as far back as 2001, data from 2001 will turn up as well when only two digits are used for the date. This is not much of a problem nowadays, since the big Y2K scare already passed without much hassle.
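A tiny illustration of why two-digit years cause trouble, using invented order records: once the century is thrown away, the stored values no longer sort or filter in true chronological order.

```python
# Invented order records stored with two-digit years.
orders = [("widget", "99"), ("gadget", "01"), ("gizmo", "05")]

# Sorting by the stored string puts the 2001 and 2005 orders *before* the 1999 one.
print(sorted(orders, key=lambda rec: rec[1]))
# [('gadget', '01'), ('gizmo', '05'), ('widget', '99')]

# And a query like "everything since '95'" silently drops the newer rows.
print([item for item, yy in orders if yy >= "95"])
# ['widget']  -> only the 1999 order survives the filter.
```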
A bigger problem for data use in the business world is how the data is recovered. A company must be able to properly retrieve the data it needs when it needs it. If a company wants to decide what products to feature during certain times of the year, it can mine its data to determine which items sold most at which times of year. But if their DBMS sucks, they will be unable to perform a good search that yields usable and reliable results.
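A minimal sketch of that kind of query against a hypothetical sales table (all the rows are invented): with the data organized this way, the "what sells when" question becomes a one-line GROUP BY instead of a manual slog.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (item TEXT, sale_date TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("sled",      "2004-12-03", 40),
     ("sled",      "2004-12-19", 55),
     ("sunscreen", "2004-07-08", 70),
     ("sunscreen", "2004-12-02",  3)],
)

# Total units sold per item per month: the "what to feature when" question.
rows = conn.execute(
    """SELECT item, strftime('%m', sale_date) AS month, SUM(quantity)
       FROM sales
       GROUP BY item, month
       ORDER BY item, month"""
).fetchall()
for row in rows:
    print(row)   # e.g. ('sled', '12', 95)
```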
In the business world, the source of data is not nearly as important as how the data is made accessible.
The storage, organization, and retrieval of data is a vital part of our everyday lives. Probably without us even knowing, our information is stored in several different databases. However, it takes a great amount of time to organize and understand the data. Unfortunately, there are many things that can lead to faults in data and incorrect results.
Some information is collected from an inappropriate sample size. If a newspaper were to report that 80% of the United States likes chocolate ice cream over vanilla, we would assume that they polled thousands of people. But if the sample size was 100 people, then stating that 80% of the US likes chocolate ice cream is not accurate. Although it may be a true statement, there is not enough evidence to back it up.
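To put a rough number on that worry (assuming a simple random sample, which a newspaper poll may well not be), the usual 95% margin-of-error formula shows just how shaky a sample of 100 is:

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 1_000, 10_000):
    print(f"n={n:6d}: 80% +/- {margin_of_error(0.80, n):.1%}")

# With only 100 respondents the estimate is roughly 80% +/- 8 points;
# it takes about 10,000 respondents to get that down near +/- 1 point.
```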
Of course, there are biases that can easily occur. There could be experimenter biases or even participant biases. Experimenter bias occurs when experimenters “find” what they expect to find. They are so sure of what the outcome will be that they may be oblivious to the fact they may be wrong. As well, if a survey is conducted, people may respond differently if they know the demands of the study.
Sometimes even random errors occur. For example, if you take a sample of a variable population, by chance the sample may not adequately represent the real population. But as mentioned before, if you increase the sample size, you decrease the chance of error.
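A small simulation shows the same thing (the population and numbers here are made up purely for illustration):

    import random

    random.seed(1)
    population = [1] * 6000 + [0] * 4000  # a population whose true rate is exactly 60%

    for n in (20, 200, 2000):
        estimates = [sum(random.sample(population, n)) / n for _ in range(5)]
        print(f"n={n:>4}:", [round(e, 2) for e in estimates])

    # With n=20 the sample estimates bounce well away from 0.60 just by chance;
    # with n=2000 they cluster much closer to the true value.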
I think that there are many problems that can affect the validity of data. First, you have to make sure you take a representative sample. This means not only taking ENOUGH data, but also taking data randomly from a group that would likely reflect the overall population. If you do not do this, you in essence poison the sample and render it useless. Another problem that could arise and make data virtually useless is timeliness. You must take data that is relevant to the time in which you are working. Relating this back to our project, taking data from court cases, interviews, or other such sources from the mid-90s has very little relevance for a problem that has evolved up to today. If you couple that with the possible bias that is sometimes involved, you have a very unreliable source and thus very unreliable data. A final problem I can think of is incomplete data. Data extracted from ANYWHERE must be complete. If the data is not complete, then you risk misconstruing an entire basis of argument.
I think that in order to avoid extracting meaningless data and using it in our problem assignment, we must be sure that we not only take data, but that we take it FROM BOTH sides of the equation. It is important, in order to eliminate bias, that we try hard to find facts presenting both sides of the problem. Bias creeps into our research even when we don't intend it to, and I think that if each person makes a conscious effort to seek both sides of an argument, and then runs it by the other group members, the data in each of our papers will be more valuable and worthwhile to the rest of the population.
Referring back to topic seven for a moment: even though a lot of it dealt with computerized databases, a case can be made that each member of a group has to manage a database of their own, because if they don't, our data and information could become shoddy and unreliable. Database management is a must not only for the computer society but for our society as well, because we engage in personal database management every day, and those skills affect the data we work through in our heads and that eventually comes out in our research.
Any problem that arises from extracting data is most likely attributable to human error. It is nearly impossible to ever collect 100% correct data. In statistics we can never be 100% certain that the conclusions we have made are correct, so we simply quantify the data and then use it to make an educated guess. Problems can arise in any number of the processes. When collecting data, people will have biases toward their own opinions, they may not give truthful answers, or they may not understand the question and therefore give the wrong answer. When putting the raw data into a single database, it is easy to mistype or misread the information. When quantifying this raw data, it is easy to misinterpret the calculations or the count. Throughout all of these processes an infinite number of problems can arise; it is all too easy to turn useful data useless.
There is not really any process that can be followed in which no errors will arise; it is a human trait to make errors no matter how many times the work is checked. In order to extract data that is valid and reliable, we must check over our work many times. We must make the opening questions very easy to understand and eliminate any biases. The people conducting the survey will need to emphasize its importance so as to maintain truthfulness throughout, and they should make sure that the raw data is transferred safely so as to eliminate any tampering with it. Any number of things may cause the data to become useless.
Businesses thrive on the information they gather from collected data, whether it is used for accounting, economic, or other purposes. In order for a business to perform efficiently, the data must be interpreted with the utmost precision and accuracy. With faulty or incomplete data, a sound conclusion cannot be reached from the dataset, so it is vital for those who interpret data to do so without error or bias.
Misinterpreting data leads to failure for businesses, as well as failure to obtain the truth from the data collected. In order to extract only the meaningful data, one must disregard incomplete and other faulty data, and one must view the data that is complete in an objective light. Personal bias plays a huge role in how conclusions are drawn from data, and despite being a well-known problem, it still happens a great deal in the world.
I took a Microsoft certification class and we worked a lot with Excel. A main problem we had during the class was that we had to take surveys, and the results were never correct because people lied. A good way to fix this is for the creator of the survey to make the questions as clear as possible and to ask as many people as possible, so the results come out more accurate. Even after the survey is given, you should build in safeguards in case the user messes up on easy instructions. If a user mistypes a PIN for an account, the computer can go back, check that it is the right number, and give them another chance to enter the correct number.
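Here is a minimal sketch of that kind of safeguard in Python (a hypothetical example, not anything from the reading): instead of silently storing whatever was typed, the program re-prompts until the entry at least passes a basic format check.

    def read_pin(max_attempts=3):
        # Ask for a 4-digit PIN, re-prompting on obviously bad input.
        for attempt in range(max_attempts):
            pin = input("Enter your 4-digit PIN: ").strip()
            if pin.isdigit() and len(pin) == 4:
                return pin
            print("That doesn't look like a 4-digit PIN -- please try again.")
        raise ValueError("Too many invalid attempts")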
There is always the problem of someone simply entering the wrong character, but that's a risk of databases. One wrong entry can mess up the entire database, so the best thing to do is check again and again to make sure everything is correct.
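A related safeguard is to let the database itself reject entries that cannot be right, rather than relying only on people re-checking by hand. A small sketch using Python's built-in SQLite module (the table and rules are made up for illustration):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""CREATE TABLE survey_responses (
        respondent_id INTEGER PRIMARY KEY,
        age INTEGER CHECK (age BETWEEN 0 AND 120),
        answer TEXT CHECK (answer IN ('yes', 'no'))
    )""")

    conn.execute("INSERT INTO survey_responses VALUES (1, 34, 'yes')")       # accepted
    try:
        conn.execute("INSERT INTO survey_responses VALUES (2, 340, 'yes')")  # typo: extra digit
    except sqlite3.IntegrityError as err:
        print("Rejected bad row:", err)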
If the survey is being given to someone you know is biased or prone to lying, you should not give them the survey. That's basically common sense; you would not want to give a biased person a survey unless you're trying to get results like that.
Integrity is the key, and if you survey enough people, most of them will have enough integrity to answer your questions honestly.
OK, let's talk about the research paper. Your resources (data) are key to making everything mesh. When it comes time to do your research, you have to know the difference between the truthful and the bad. Basically, you need to research your research: find a piece of data that you feel may be accurate, then check respectable sources to see how their accounts match up. Keep in mind that even your resources may be wrong, but that's a chance you'll have to take.
So now that you have lots of research for your paper, you need to pick out what you need and what you don’t. Make an outline (Project Goals) and see how your research compares to your goals.
Finally, you have to follow that outline of yours so that you can get your resources and your idea (the project as a whole) to work well.
—