How to do in C++
HTML Converter
Create a program that reads an HTML file and converts it to plain text.
Console
HTML Converter
Grocery List
* Eggs
* Milk
* Butter
Specifications
Your instructor should provide an HTML file named groceries.html that contains these HTML tags:
<h1>Grocery List</h1>
<ul>
<li>Eggs</li>
<li>Milk</li>
<li>Butter</li>
</ul>
When the program starts, it should read the contents of the file, remove the HTML tags, remove any spaces to the left of the tags, add asterisks (*) before the list items, and display the resulting plain text on the console as shown above.
Note
The groceries.html file ends each line with a carriage return character ('\r'), not a newline character ('\n'). To account for that, when you read the file, you can use the third parameter of the getline() function to specify the end of the line. For more information, you can search online and check the documentation of the getline() function.
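As a minimal sketch of the delimiter parameter mentioned in the note, the helper below reads '\r'-terminated lines from any input stream. The name splitLines is hypothetical and not part of the assignment; it takes a std::istream so that an in-memory string can stand in for the file while experimenting.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collect every line of a stream whose lines end in '\r'.
std::vector<std::string> splitLines(std::istream& in)
{
    std::vector<std::string> lines;
    std::string line;
    // The third argument replaces the default '\n' delimiter with '\r'.
    while (std::getline(in, line, '\r'))
        lines.push_back(line);
    return lines;
}
```

For example, constructing std::istringstream in("<ul>\r<li>Eggs</li>\r") and calling splitLines(in) should give two strings, "<ul>" and "<li>Eggs</li>". An std::ifstream opened on groceries.html can be passed the same way.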
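#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main()
{
    cout << "HTML Converter\n";

    ifstream myfile("groceries.html"); // open the HTML file for reading
    if (!myfile) // if the file cannot be opened, report the error and stop
    {
        cout << "Cannot open groceries.html\n";
        return 1;
    }

    const string whitespace = " \t\r\n";
    string line;
    // The file ends each line with '\r', so pass it as the delimiter
    while (getline(myfile, line, '\r'))
    {
        bool inTag = false; // true while between '<' and '>'
        string result = ""; // text of the line with the tags removed

        // loop through each character of the line, skipping the HTML tags
        for (size_t c = 0; c < line.length(); c++)
        {
            if (line[c] == '<')
                inTag = true;
            else if (line[c] == '>')
                inTag = false;
            else if (!inTag)
                result += line[c];
        }

        // remove the whitespace around the text, including the spaces
        // to the left of the tags
        size_t first = result.find_first_not_of(whitespace);
        if (first == string::npos)
            continue; // the line held only tags, such as <ul> or </ul>
        size_t last = result.find_last_not_of(whitespace);
        result = result.substr(first, last - first + 1);

        // print list items with an asterisk (*), everything else as is
        if (line.find("<li>") != string::npos)
            cout << "* " << result << '\n';
        else
            cout << result << '\n';
    }
    return 0;
}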
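To check the tag-stripping step without needing the file, the per-line work can be pulled into a function of its own. The sketch below is one way to do that, not the only one; the name convertLine is hypothetical, and it assumes that list items are marked with a plain <li> tag with no attributes, as in the groceries.html shown above.

```cpp
#include <string>

// Hypothetical helper: convert one line of HTML to plain text.
// Returns an empty string for lines that contain only tags.
std::string convertLine(const std::string& line)
{
    bool inTag = false; // true while between '<' and '>'
    std::string text;
    for (char ch : line)
    {
        if (ch == '<')
            inTag = true;
        else if (ch == '>')
            inTag = false;
        else if (!inTag)
            text += ch;
    }

    // Trim whitespace on both sides, including spaces left of the tags.
    const std::string whitespace = " \t\r\n";
    std::size_t first = text.find_first_not_of(whitespace);
    if (first == std::string::npos)
        return "";
    std::size_t last = text.find_last_not_of(whitespace);
    text = text.substr(first, last - first + 1);

    // Assumption: a list item is any line containing a bare <li> tag.
    bool isListItem = line.find("<li>") != std::string::npos;
    return isListItem ? "* " + text : text;
}
```

With this helper, convertLine("<h1>Grocery List</h1>") gives "Grocery List", convertLine("    <li>Eggs</li>") gives "* Eggs", and convertLine("<ul>") gives an empty string that the caller can skip.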