blah blah blah is here! blah blah » Close

up0down
link

I want to manually read in a CSV file and use the split to load it into a data table. However, I need to not Split if it is within "". There may be an easier way to do it (if so please let me know)

but I'm reading the CSV files into an array of lines.

string[] Temp = File.ReadAllLines(FileName.Text);
char[] delimiter = new char[2];
delimiter[0] = ',';
string[] HeaderRow = Temp[0].ToString().Split(delimiter);
lDT.Reset();
foreach(string header in HeaderRow)
{
lDT.Columns.Add(header);
}
for (int i = 1; Temp.Length > i; i++)
{
DataRow row = lDT.NewRow();
string[] csvRow = Temp[i].ToString().Split(delimiter);
for (int j = 0; csvRow.Length > j; j++)
{
row[j] = csvRow[j];
}
lDT.Rows.Add(row);
}

The problem I have is that there are some values such as "1,234.56" that are throwing it off. So I need to ignore the Delimiter if it is in quotes. Any suggestions (either to fix the issue or a better way to pull the data?)

last answered one year ago

1 answers

up0down
link

What I'd do is to write a routine which removes the embedded commas, then remove any surrounding double quotes and finally do the split on the remaining commas. The code would be something like this:

using System.Text;

// ...
for (int i = 1; Temp.Length > i; i++)
{
DataRow row = lDT.NewRow();
string[] csvRow = RemoveEmbeddedCommas(Temp[i]).Replace("\"","").Split(delimiter);
for (int j = 0; csvRow.Length > j; j++)
{
row[j] = csvRow[j];
}
lDT.Rows.Add(row);
}

// ...

static string RemoveEmbeddedCommas(string s)
{
StringBuilder sb = new StringBuilder(s);
bool embedded = false;

for(int i = sb.Length - 1; i >= 0; i--)
{
if (sb[i] == '\"')
{
embedded = !embedded;
}
else if (sb[i] == ',' && embedded)
{
sb.Remove(i, 1);
}
}
return sb.ToString();
}

If you want to preserve the embedded commas in the database entries, then instead of removing them, replace them with a character which you know won't be used in the file - say (char)255.

Having done the split, all you need to do then is to replace the (char)255's with commas.

thanks I was hoping for a simpler solution but since that does not appear to be available this is a great idea for a fix. (BTW I'm using the replace suggestion)

Feedback